diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/00/5b8c9612c07b8cc074888d10fbc1dd975cdf873c4a385add8b02d886b3ff06 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/00/5b8c9612c07b8cc074888d10fbc1dd975cdf873c4a385add8b02d886b3ff06 deleted file mode 100644 index 69cf429..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/00/5b8c9612c07b8cc074888d10fbc1dd975cdf873c4a385add8b02d886b3ff06 +++ /dev/null @@ -1,656 +0,0 @@ -I"ô
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land). My interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access" &
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE, # shift_geometry() comes from tigris, which is not attached above
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although read_sf() can also read a geopackage and accepts the same layer argument.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
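As a minimal sketch of that logic, here is the same pair of conditions applied to a toy data frame. The rows are made up for illustration, and I use base R's `!(x %in% y)` in place of `%!in%` so the snippet runs without any packages.

```r
# Toy stand-in for the PAD-US Fee layer; these rows are invented for illustration.
parks <- data.frame(
  Unit_Nm  = c("Park A", "Park B", "Park C", "Park D"),
  State_Nm = c("NV", "PR", "OR", "GU"),
  Own_Type = c("STAT", "STAT", "FED", "FED")
)

territories <- c("AS", "GU", "MP", "PR", "VI")

# Base R equivalent of: State_Nm %!in% territories & Own_Type == "STAT"
# A row is kept only when BOTH conditions are TRUE.
keep <- !(parks$State_Nm %in% territories) & parks$Own_Type == "STAT"
state_owned <- parks[keep, ]

print(state_owned$Unit_Nm)
```

Park B is state owned but sits in a territory, and Park C is in a state but federally owned, so only Park A survives both tests.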
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
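I can only guess at why the combined version failed, but one common culprit is operator precedence: in R, `&` binds more tightly than `|`, so mixing them without parentheses changes which rows pass. The sketch below uses made-up rows to show the difference, and shows `%in%` as one possible way to fold the designation list into a single condition.

```r
# Invented rows; only the column names mirror the PAD-US data.
parks <- data.frame(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  Own_Type = c("STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SREC", "MIL")
)

# Without parentheses R reads this as
# (Own_Type == "STAT" & Des_Tp == "SP") | Des_Tp == "SREC",
# so the federal row with Des_Tp "SREC" slips through.
loose <- parks$Own_Type == "STAT" & parks$Des_Tp == "SP" | parks$Des_Tp == "SREC"

# %in% tests membership in one step, which keeps the AND unambiguous.
wanted <- c("SP", "SREC")
strict <- parks$Own_Type == "STAT" & parks$Des_Tp %in% wanted

print(parks$Unit_Nm[loose])
print(parks$Unit_Nm[strict])
```

Inside a `filter()` call the same idea would read `Des_Tp %in% wanted`, which also replaces the nine `==` comparisons with one vector of codes.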
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
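For comparison, here is a small sketch of the two ways to write that condition, using a made-up vector of access values. The `NA` entry is an assumption added for illustration; I have not checked whether the PAD-US column actually contains missing values.

```r
# Hypothetical access values, including a missing one for illustration.
access <- c("Open Access", "Restricted Access", "Closed", "Unknown", NA)

# Exclusion, as in the filter() call: a missing value compares to NA,
# and filter() drops rows where the condition is NA.
keep_by_exclusion <- access != "Closed" & access != "Unknown"

# Inclusion: %in% never returns NA, so a missing value is simply FALSE.
keep_by_inclusion <- access %in% c("Open Access", "Restricted Access")

print(keep_by_exclusion)
print(keep_by_inclusion)
```

Both versions end up keeping the same rows once `filter()` discards the `NA`, so the choice really is about which one is easier to read later.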
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of dplyr, which loads with the tidyverse, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"
s; for the rest, I just populated the new column with the old values.
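The same old-value-to-new-value mapping can also be sketched as a named lookup vector in base R. This is an alternative to `case_when()`, not what the code above does, and it uses only three of the designations plus one made-up value to show what happens when nothing matches.

```r
# Names are the original d_Des_Tp values, elements are the new type labels.
type_lookup <- c(
  "Access Area"           = "State Trail",
  "State Park"            = "State Park or Parkway",
  "State Recreation Area" = "State Preserve, Reserve, or Recreation Area"
)

# "Something Else" is a made-up designation with no entry in the lookup.
d_Des_Tp <- c("State Park", "Access Area", "State Recreation Area", "Something Else")

# Indexing by name returns the new label, or NA when there is no match,
# which mirrors a case_when() call that has no default branch.
type <- unname(type_lookup[d_Des_Tp])

print(type)
```

Keeping the mapping in one vector makes it easy to see at a glance which designations collapse into which type.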
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
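One possible way to keep track, sketched here under the assumption that park names in `Unit_Nm` are unique enough to match on: store the visited names in a single vector and test membership, instead of writing one `case_when()` line per park. The second row below is a made-up park name.

```r
# Parks I have visited, kept in one place so new ones are a single-line change.
visited_parks <- c(
  "Valley of Fire State Park",
  "Crissey Field State Recreation Site",
  "Salton Sea"
)

# Toy data; "Some Other State Park" is invented for illustration.
parks <- data.frame(
  Unit_Nm = c("Valley of Fire State Park", "Some Other State Park", "Salton Sea")
)

# Every name in the vector is marked visited; everything else is not.
parks$visited <- ifelse(parks$Unit_Nm %in% visited_parks, "visited", "not visited")

print(parks)
```

The vector could just as well be read from a small text or CSV file, which would keep the list of parks out of the processing code entirely.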
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears below the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -The split() function
is part of base R.
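To see what `split()` and the loop do, here is a small sketch on a plain data frame with made-up rows. It writes CSV files to a temporary folder instead of shapefiles so it runs without sf; the `out_dir` folder is a stand-in I chose for illustration, not the project's real path.

```r
# Invented rows; State_Nm plays the same role as in the park data.
parks <- data.frame(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  State_Nm = c("NV", "CA", "NV")
)

# split() returns a named list with one data frame per state.
split_states <- split(parks, f = parks$State_Nm)
all_names <- names(split_states)

# Create the output folder first so the writes do not fail on a missing path.
out_dir <- file.path(tempdir(), "individual")
dir.create(out_dir, showWarnings = FALSE)

# One file per state, named after the state abbreviation.
for (name in all_names) {
  write.csv(split_states[[name]],
            file.path(out_dir, paste0(name, ".csv")),
            row.names = FALSE)
}

print(all_names)
```

The list names come from the values in `State_Nm`, which is why the loop can reuse each name both to pull out the data frame and to build the file name.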
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me a line if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land). My interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
- filter(Own_Type == "STAT")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access" &
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE, # shift_geometry() comes from the tigris package
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read() and tell it which layer to read. read_sf() is a wrapper around the same function, so it would also work.
Here, st_read() takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35, make sure you choose the one with the .gpkg extension.
The second argument, layer =, specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
The Tidyverse’s filter() function allows you to combine conditions within one filter call by using or (|) or and (&).
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
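Here is a small sketch of that two-condition filter on made-up rows, so you can see which ones survive without loading the geopackage. The park names are invented, and I use !(x %in% y), the base R spelling of the %!in% operator from operator.tools, so the example only needs dplyr.

```r
library(dplyr)

# Made-up rows that mimic the State_Nm and Own_Type columns
toy <- tibble(
  Unit_Nm  = c("Park A", "Park B", "Park C", "Park D"),
  State_Nm = c("NV", "PR", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "FED", "STAT")
)

territories <- c("AS", "GU", "MP", "PR", "VI")

# Keep rows that are NOT in a territory AND are state owned.
# Both conditions must be TRUE for a row to be kept.
kept <- toy %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT")

kept$Unit_Nm
```

Park B is dropped because it is in a territory, and Park C is dropped because it is not state owned.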
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
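One possible reason merging the two filter() calls did not work is operator precedence: in R, & binds more tightly than |, so a chain of | conditions placed after an & needs its own parentheses. This is only a guess at what went wrong. The sketch below uses made-up rows (XX is not a real designation code) and shows how %in% sidesteps the problem.

```r
library(dplyr)

# Made-up rows: two state owned, two federal
toy <- tibble(
  Own_Type = c("STAT", "FED", "STAT", "FED"),
  Des_Tp   = c("SP",   "SP",  "XX",   "SREC")
)

# Without parentheses this is read as
# (Own_Type == "STAT" & Des_Tp == "SP") | Des_Tp == "SREC",
# so the federal SREC row slips through.
unparenthesised <- toy %>%
  filter(Own_Type == "STAT" & Des_Tp == "SP" | Des_Tp == "SREC")

# %in% keeps the designation test as one condition
wanted <- c("SP", "SREC")
combined <- toy %>%
  filter(Own_Type == "STAT" & Des_Tp %in% wanted)

nrow(unparenthesised)  # 2 rows, one of them federal
nrow(combined)         # 1 row, state owned with a wanted designation
```

With %in%, the nine designation codes could sit in one vector instead of nine separate == comparisons.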
-Yet another filter() call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter() call, I chose to use != solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself, and if I saw d_Pub_Acce == "Open Access" my first thought would be: “What are the other types?” and then I’d try to find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
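One thing worth keeping in mind with != filters: filter() only keeps rows where the condition is TRUE, so rows with a missing value in the column are dropped as well. I have not checked whether the PAD-US columns contain missing values, so treat this as a caution rather than a known problem. The rows below are made up.

```r
library(dplyr)

# Made-up rows, one with a missing access value
toy <- tibble(
  Unit_Nm    = c("Park A", "Park B", "Park C", "Park D"),
  d_Pub_Acce = c("Open Access", "Closed", NA, "Restricted Access")
)

# NA != "Closed" evaluates to NA, and filter() treats NA like FALSE
open_parks <- toy %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")

open_parks$Unit_Nm
```

Park C disappears even though it is neither Closed nor Unknown. If you wanted to keep rows like that, adding | is.na(d_Pub_Acce) inside the filter is one option.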
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value"). I only needed to change the values for the "Recreation Management Area" entries; for the rest, I just populated the new column with the old values.
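As a small illustration of that pattern, the sketch below recodes two designations and lets everything else fall through unchanged by ending the case_when() with TRUE ~ d_Des_Tp. The three rows are made up, and this fallback is my own shorthand rather than what the code above does, which spells out every designation.

```r
library(dplyr)

# Made-up designation values
toy <- tibble(
  d_Des_Tp = c("State Park", "Recreation Management Area", "State Wilderness")
)

recoded <- toy %>%
  mutate(type = case_when(
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    d_Des_Tp == "State Park" ~ "State Park or Parkway",
    TRUE ~ d_Des_Tp  # anything not matched above keeps its original value
  ))

recoded$type
```

Without the TRUE ~ line, any value that matches none of the conditions ends up as NA in the new column.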
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate() creates a new column, visited, and populates it by using case_when().
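One possible way to make the visited list easier to maintain is to keep the park names in a plain character vector and test membership with %in%. This is just a sketch of an alternative, not what the code above does. The vector name visited_parks and the toy rows are made up for the example.

```r
library(dplyr)

# Hypothetical list of parks I've been to; add names here as needed
visited_parks <- c(
  "Valley of Fire State Park",
  "Anza-Borrego Desert State Park"
)

# Made-up rows standing in for the state park data
toy <- tibble(
  Unit_Nm = c(
    "Valley of Fire State Park",
    "Some Other State Park",
    "Anza-Borrego Desert State Park"
  )
)

# One membership test replaces one case_when() line per park
marked <- toy %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

marked$visited
```

The vector could also be read from a small text or CSV file, which would keep the list of parks out of the processing code entirely.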
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
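To show what the re-projection step does in isolation, here is a minimal sketch with a single made-up point. I use EPSG:5070 purely as an example of an Albers projection; the coordinate system that shift_geometry() actually returns may be a different Albers definition. EPSG:4326 is the code for WGS84 longitude and latitude.

```r
library(sf)

# A made-up point in an Albers equal-area projection (units are metres)
albers_pt <- st_sfc(st_point(c(0, 0)), crs = 5070)

# Re-project to WGS84 longitude/latitude
wgs84_pt <- st_transform(albers_pt, 4326)

st_is_longlat(albers_pt)  # FALSE
st_is_longlat(wgs84_pt)   # TRUE
st_coordinates(wgs84_pt)  # now in degrees rather than metres
```

st_is_longlat() is a quick way to confirm that a layer is in longitude and latitude before handing it to Leaflet.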
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
-
split_states <- split(state_parks, f = state_parks$State_Nm) # split the data by state
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))
- }
-
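If you want to try the split-and-save loop without the full data set, here is a sketch that runs it on three made-up points and writes to a temporary folder. The out_dir path and the toy parks are invented for the example. I also added dir.create() and delete_dsn = TRUE, which are my own additions so the loop can be re-run without complaining about missing folders or existing files.

```r
library(sf)

# Three made-up points standing in for state park polygons
toy_parks <- st_sf(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  State_Nm = c("NV", "NV", "OR"),
  geometry = st_sfc(
    st_point(c(-115, 36)),
    st_point(c(-116, 37)),
    st_point(c(-124, 42)),
    crs = 4326
  )
)

# Hypothetical output folder inside the session's temp directory
out_dir <- file.path(tempdir(), "individual")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# One list element per state, named by the state abbreviation
split_states <- split(toy_parks, f = toy_parks$State_Nm)

for (name in names(split_states)) {
  st_write(
    split_states[[name]],
    file.path(out_dir, paste0(name, ".shp")),
    delete_dsn = TRUE,  # replace the file if the loop is re-run
    quiet = TRUE
  )
}

list.files(out_dir, pattern = "\\.shp$")
```

Each state ends up in its own shapefile named after its abbreviation, which is the same result the loop above produces for the real data.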
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me a line if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There’s a couple of ways to view the data that does not rely on R. First, is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
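- -Since we’re working with a geopackage (instead of a shapefile) I load the data using st_read()
instead of read_sf(). Either function can read a geopackage, because read_sf() is a wrapper around st_read() with quieter defaults.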
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data from the state parks.
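If loading a whole layer is too much for your machine, one option is to let GDAL filter the rows while reading. The sketch below is only an illustration: it writes a tiny made-up geopackage to a temporary file (the layer name `toy_fee` and its values are invented, not the PAD-US schema), lists its layers with `st_layers()`, and then uses the `query` argument of `st_read()` to read just the state-owned rows.

```r
library(sf)

# Toy stand-in for the PAD-US geopackage (hypothetical layer name and values)
toy <- st_sf(
  Own_Type = c("STAT", "FED", "STAT"),
  Unit_Nm  = c("Park A", "Forest B", "Park C"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(2, 2)),
                    crs = 4326)
)
gpkg_path <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg_path, layer = "toy_fee", quiet = TRUE)

# List the layer names without loading any features
layer_info <- st_layers(gpkg_path)
layer_info$name

# Read only the state-owned rows; the SQL filter runs before the data reaches R
state_only <- st_read(
  gpkg_path,
  query = "SELECT * FROM toy_fee WHERE Own_Type = 'STAT'",
  quiet = TRUE
)
nrow(state_only)
```

With the real file, the same idea would presumably use the PADUS3_0Fee layer name in the SQL statement. I have not timed it against the full data set, so treat it as something to try rather than a guaranteed fix.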
The tidyverse’s filter() function lets you combine conditions within one filter call by using or (|) or and (&).
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
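To see the logic on something small, here is a minimal sketch in base R rather than dplyr, so it runs without extra packages. The rows are invented, `keep_types` is only a subset of the designation codes used above, and the local `%!in%` is a stand-in for the operator.tools version. It also shows that a long chain of `Des_Tp == "..." |` tests can be collapsed into a single `%in%` test.

```r
# Stand-in for operator.tools' %!in% so the sketch has no dependencies
`%!in%` <- Negate(`%in%`)

territories <- c("AS", "GU", "MP", "PR", "VI")
keep_types  <- c("SP", "REC", "SREC")  # subset of the designation codes above

# Toy rows (invented values, not real PAD-US records)
toy <- data.frame(
  State_Nm = c("NV", "PR", "OR", "CA", "GU", "UT"),
  Own_Type = c("STAT", "STAT", "STAT", "FED", "STAT", "STAT"),
  Des_Tp   = c("SP", "SP", "SREC", "REC", "SP", "SW"),
  stringsAsFactors = FALSE
)

# Both conditions joined by & must be TRUE for a row to be kept
state_owned <- toy[toy$State_Nm %!in% territories & toy$Own_Type == "STAT", ]

# One %in% test replaces Des_Tp == "SP" | Des_Tp == "REC" | Des_Tp == "SREC"
parks <- state_owned[state_owned$Des_Tp %in% keep_types, ]
parks$State_Nm
```

The same `%in%` test works inside `filter()`, which would shorten the second filter call considerably.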
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This is part three of my cartography in R series. If you are just finding this, I suggest taking a look at part I and part II first.
- -In this post, I will download and process the National Park data. Once that’s done, I’ll add it to the base map I created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three [this post]
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
- library("tigris") # shift_geometry() used below
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-
## load data
- states <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
to reflect wherever you saved the shifted shapefile.
If your data processing and base map creation are in the same file, you can skip this line, and when you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -The National Park Service provides all the data we’ll need to make the map. The data is accessible on the ArcGIS’ Open Data website. Once you click on the link you’ll see a bunch of icons that lead to different data that’s available for download. Click on the one for boundaries.
- - - -From here, you’ll be taken to a list of available National Park data. The second link should be nps boundary which contains the shape data for all the National Parks in the United States. The file contains all the data for the park outlines along with hiking trails, rest areas, and lots of other data.
- - - -The nps boundary link will take you to a map showing the national parks. On the left, there will be a download link.
- - - -From here, you’ll have a few download options. The National Park Service provides the data in different formats including CSV and Shapefile. You’ll want to download the shapefile version.
- - - -Be sure to save the file somewhere on your hard drive that is easy to find. When it finishes downloading, be sure to unzip the file. There will be four files inside the folder. All of them need to be kept in the same location. Even though we’ll only load the .shp
file, R uses the three others to create the necessary shapes.
The code below may look intimidating, but it’s fairly straightforward. I’ll go over each line below.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- ## load and process nps data
- nps <- read_sf("./shapefiles/original/nps/NPS_-_Land_Resources_Division_Boundary_and_Tract_Data_Service.shp") %>%
- select(STATE, UNIT_TYPE, PARKNAME, Shape__Are, geometry) %>%
- filter(STATE %!in% territories) %>%
- mutate(type = case_when(UNIT_TYPE == "International Historic Site" ~ "International Historic Site", # there's 23 types of national park, I wanted to reduce this number.
- UNIT_TYPE == "National Battlefield Site" ~ "National Military or Battlefield", # lines 56-77 reduce the number of park types
- UNIT_TYPE == "National Military Park" ~ "National Military or Battlefield",
- UNIT_TYPE == "National Battlefield" ~ "National Military or Battlefield",
- UNIT_TYPE == "National Historical Park" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Historic Site" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Historic Trail" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Memorial" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Monument" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Preserve" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National Reserve" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National Recreation Area" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National River" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Lakeshore" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Wild & Scenic River" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Seashore" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Trails Syste" ~ "National Trail",
- UNIT_TYPE == "National Scenic Trail" ~ "National Trail",
- UNIT_TYPE == "National Park" ~ "National Park or Parkway",
- UNIT_TYPE == "Park" ~ "National Park or Parkway",
- UNIT_TYPE == "Parkway" ~ "National Park or Parkway",
- UNIT_TYPE == "Other Designation" ~ "Other National Land Area")) %>%
- mutate(visited = case_when(PARKNAME == "Joshua Tree" ~ "visited",
- PARKNAME == "Redwood" ~ "visited",
- PARKNAME == "Santa Monica Mountains" ~ "visited",
- PARKNAME == "Sequoia" ~ "visited",
- PARKNAME == "Kings Canyon" ~ "visited",
- PARKNAME == "Lewis and Clark" ~ "visited",
- PARKNAME == "Mount Rainier" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted national park data
- st_write(nps, "~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
In part I of this series I talked about how R has an %in%
function, but not a %!in%
function. Here’s where the latter function shines.
The United States is still an empire with its associated territories and islands. In this project I am interested in the 50 states - without these other areas. As a result, I need to filter them out. Using base R’s %in%
function I would have to create a variable that contains the postal abbreviations for all 50 states. That is annoying. Instead, I want to use the shorter list that only includes the US’s associated islands and territories. To do so, however, I need to use the operator.tools package’s %!in%
function.
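For comparison, base R does ship a built-in vector of the 50 state abbreviations called `state.abb`, so the keep-list approach is possible without typing them all out. The sketch below uses made-up codes to show one reason the exclude-list approach can still be the safer choice: `state.abb` does not include DC, so a keep-list built from it drops DC rows, while excluding the territories keeps them.

```r
territories <- c("AS", "GU", "MP", "PR", "VI")
codes <- c("CA", "PR", "NV", "GU", "DC")  # illustrative values only

# Keep-list: only codes found in the built-in 50-state vector survive
keep_states <- codes[codes %in% state.abb]

# Exclude-list: everything except the territories survives, including DC
drop_territories <- codes[!(codes %in% territories)]

keep_states
drop_territories
```

Which behavior you want depends on whether DC should appear on your map.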
Line 2 creates the list of US territories that I filter out in line 7. The c()
function in R means combine or concatenate. Inside the parentheses are the five postal codes for American Samoa, Guam, the Northern Mariana Islands, Puerto Rico, and the U.S. Virgin Islands.
nps <- read_sf("path/to/file.shp")
loads the National Park data set to a variable called nps
using the read_sf()
function that is part of the sf package. You will need to change the file path so it reflects where you saved the data on your hard drive.
The %>%
operator comes from the magrittr package, which the tidyverse loads. It passes the result of the current step to the function on the next line as its first argument. It has to go at the end of a line, rather than the beginning.
select
is part of the tidyverse package. With it, we can select columns by their name rather than their associated number. Large data sets take more memory and computing power because every extra column is carried through each step. Unfortunately, rendering maps also takes a lot of computing power so I like to discard any unnecessary columns to reduce the amount of effort my computer has to exert.
Deciding on which columns to keep will depend on the data you’re using and what you want to map (or analyze). I know for my project I want to include a few things:
-There are a couple of ways to inspect the data to see what kind of information is available.
- -view(nps)
but as the number of data points increases, so does R’s struggle with opening it. I’ve found that VSCode doesn’t throw as big of a fit as RStudio when opening large data sets.
- -data.frame(colnames(nps))
. This will return the data set’s column names. This is my preferred method. I then go to the documentation to see what each column contains. This isn’t foolproof because it depends on whether the data has good documentation.
- -The National Park data includes a lot of information about who created the data and maintains the property. I’m not interested in this, so in line 6 I select the following columns:
-The geometry column is specific to spatial (sf) data and it includes the coordinates of the shape. It will be kept automatically - unless you use the st_drop_geometry()
function. I like to specifically select so I remember it’s there.
In line 7 I use the territories list I created in line 2 to filter out the United States’ associated areas. Since the nps data uses the two character state abbreviation, I have to use the two character abbreviation for the territories. Searching for “Guam,” for example, won’t work.
- -filter()
is part of the tidyverse and it uses conditional language. In the parentheses is a condition that must be true if the tidyverse is going to keep the row. Starting at the top of the data, R goes “alright, does the value in the STATE column match any of the values in the territories list?” If the condition is TRUE, R adds the row to the new data frame.
With the %!in%
operator, any row that evaluates as TRUE will be kept because the value is NOT found in the territories list. If I wanted to keep only the territories, I would use the %in%
operator and only the rows with STATE abbreviations found in the territories list would be kept. For example, if the STATE value in row 1 is CA, filter looks at it and goes “is CA NOT IN territories?” If that is TRUE, keep it because we want only the values that are NOT IN the territories list.
- -mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
The NPS data set has 23 different types of National Parks listed (you can view all of them by running levels(as.factor(nps$UNIT_TYPE))
). I know that in later posts, I’m going to color code the land by type (blue for rivers, green for national parks, etc) so I wanted to reduce the number of colors I would have to use.
mutate()
’s first argument, type =
creates a new column called type
. R will populate the newly created column with whatever comes after the first (singular) equal =
sign. For example, I can put type = NA
and every row in the column will say NA
.
Here, I am using the case_when()
function, which is also part of the tidyverse. The logic of case_when
is fairly straightforward. The first value is the name of the column you want R to look in (here: UNIT_TYPE
). Next, is a conditional. Here I am looking for an exact match (==
) to the string (words) inside the first set of quotation marks (in line 8: "International Historic Site"
). The last part of the argument is what I want R to put in the type
column when it finds a row where the UNIT_TYPE
is "International Historic Site"
.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
Lines 9-29 do the same thing for the other park types. You can reduce the parks however you want or use all 23 types. Just remember that the value before the tilde ~
has to match the values found in the data exactly. For example, in line 24 I change the NPS data’s National Trails Syste value to be National Trail. Whoever created the data set did not spell system correctly, so for R to match the value I also have to omit the last letter in system.
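The exact-match requirement is easy to see on a toy vector. The sketch below is not the case_when() call from the post. It swaps in a base R named lookup vector, which is an alternative way to do the same recoding, and uses a handful of the park types. Values that have no match come back as NA, which is also what case_when() returns when no condition matches and there is no TRUE fallback.

```r
# Names are the raw UNIT_TYPE values, elements are the grouped labels
type_lookup <- c(
  "National Battlefield"   = "National Military or Battlefield",
  "National Military Park" = "National Military or Battlefield",
  "National Trails Syste"  = "National Trail",  # spelled as it is in the data
  "National Scenic Trail"  = "National Trail"
)

# Toy input; the third value uses the full spelling and will not match
unit_type <- c("National Battlefield", "National Trails Syste",
               "National Trails System", "National Scenic Trail")

# Indexing by name does the exact-match recoding; misses come back as NA
type <- unname(type_lookup[unit_type])
type
```

Checking the new column for NA values after recoding is a quick way to catch a spelling that does not match the data.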
Lines 30-37 use the same mutate()
and case_when
logic as above. Instead of reducing the number of park types, I use it to mark the different parks I have visited.
Line 30 creates the new column, visited
and uses case_when
to look for the names of the parks that I’ve been to. If I have visited them, it adds visited
to the column of the same name.
The last line, TRUE ~ "not visited"))
, acts as an else statement. For any park not listed above, it will put not visited
in the visited
column I created.
This feels like a very brute-force method of tracking which parks I’ve visited, but I haven’t spent much time trying to find another way.
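One possibly lighter option is to keep the visited parks in a single vector and test membership with `%in%`. This is a sketch on a toy vector of park names, not the nps data, and the variable names are my own for illustration.

```r
# Parks I have been to, kept in one place so the list is easy to extend
visited_parks <- c("Joshua Tree", "Redwood", "Sequoia")

# Toy stand-in for the PARKNAME column
park_names <- c("Redwood", "Yosemite", "Joshua Tree", "Zion")

# ifelse() labels each park based on whether its name is in the vector
visited <- ifelse(park_names %in% visited_parks, "visited", "not visited")
visited
```

Inside the pipeline, the same `ifelse()` expression could sit in the `mutate()` call in place of the `case_when()` block, and adding a park would mean editing only the vector.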
- -In part I, when I made the base map, I moved Alaska and Hawaii so they were of similar size and closer to the continental USA. For the map to display the parks correctly, I have to shift them as well.
- -I went over these two lines in part II, so I won’t go over them again here. If you want to read more about them, check out that post.
- -The last line uses the st_transform()
function from the sf package to convert the data set from NAD83 to WGS84. Leaflet requires WGS84, so be sure to include this line at the end of your data manipulation.
I covered the WGS84 ellipsoid in part I, if you want to read more about it.
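If you want to confirm the conversion worked, sf can report the coordinate system of the result. A minimal sketch, assuming a single made-up point stored in NAD83 (EPSG:4269) and using the EPSG code 4326 for WGS84 as shorthand for roughly the same target as the PROJ string above:

```r
library(sf)

# Toy point stored in NAD83 (EPSG:4269); the coordinates are illustrative only
pt_nad83 <- st_sfc(st_point(c(-114.5, 36.4)), crs = 4269)

# Convert to WGS84 longitude/latitude using the EPSG code
pt_wgs84 <- st_transform(pt_nad83, 4326)

st_crs(pt_wgs84)$epsg    # EPSG code of the transformed data
st_is_longlat(pt_wgs84)  # TRUE when coordinates are longitude/latitude
```

Running the same two checks on the nps object before handing it to Leaflet is a cheap sanity test.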
- -Strictly speaking, this line isn’t necessary. You can do all your data processing in the same file where you make your map, but I prefer to separate the steps into different files.
- -As a result, I save the shifted data to my hard drive so it’s easier to load later. I usually have this line commented out (by placing #
at the start of the line) after I save it the first time. I don’t want it to save every time I run the rest of the code.
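An alternative to commenting the line in and out is to only write the file when it does not exist yet. The sketch below shows the idea with a hypothetical helper, `save_once()`, and uses `saveRDS()` on a toy data frame as a stand-in for the `st_write()` call so it runs without any spatial data.

```r
# Hypothetical helper: write the object only if the file is not there yet
save_once <- function(obj, path) {
  if (file.exists(path)) {
    return(invisible(FALSE))  # file already saved, skip the write
  }
  saveRDS(obj, path)          # stand-in for st_write(obj, path)
  invisible(TRUE)
}

out_path <- tempfile(fileext = ".rds")  # fresh path that does not exist yet
toy <- data.frame(PARKNAME = c("Redwood", "Sequoia"))

first  <- save_once(toy, out_path)  # writes the file
second <- save_once(toy, out_path)  # skips, because the file now exists
```

The trade-off is that changes to the processing code will not reach the saved file until you delete it, so this only suits data that rarely changes.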
1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-
## create usa Base Map using leaflet()
- map <- leaflet() %>%
- addPolygons(data = states,
- smoothFactor = 0.2,
- fillColor = "#808080",
- fillOpacity = 0.5,
- stroke = TRUE,
- weight = 0.5,
- opacity = 0.5,
- color = "#808080",
- highlight = highlightOptions(
- weight = 0.5,
- color = "#000000",
- fillOpacity = 0.7,
- bringToFront = FALSE),
- group = "Base Map") %>%
- addPolygons(data = nps,
- smoothFactor = 0.2,
- fillColor = "#354f52",
- fillOpacity = 1,
- stroke = TRUE,
- weight = 1,
- opacity = 0.5,
- color = "#354f52",
- highlight = highlightOptions(
- weight = 3,
- color = "#fff",
- fillOpacity = 0.8,
- bringToFront = TRUE),
- group = "National Parks") %>%
- addLayersControl(
- baseGroups = "Base Map",
- overlayGroups = "National Parks",
- options = layersControlOptions(collapsed = FALSE))
-
Lines 2-16 are identical to those in part II where I created the base map. I am not going to cover these lines in detail, because I covered them previously.
- -To add the National Park data to the base map, we call addPolygons()
again. The arguments are the same as before - color, opacity, outline style - just with different values. By changing those values, we can differentiate the base map from the national park data.
Since we’re mapping the National Parks and not the states, we have to tell R where the data is located using data = nps
.
smoothFactor
determines how detailed the park boundaries should be. The lower the number, the more detailed the shape. The higher the number, the smoother the parks will render. I usually match this to whatever I set for the base map for consistency.
Define the color and transparency of the National Parks. In a future post, I am going to change the color of each type of public land, but for now, I’ll make them all a nice sage green color #354f52
. I also want the parks to be fully opaque.
The next four lines (21-24) define what kind of outline the National Parks will have. I detail each of these arguments in part II of this series.
- -Briefly, I want there to be an outline to each park (stroke = TRUE
) that’s thicker weight = 1
than the outline used on the base map. I do not like the way it looks at full opacity, so I make it half-transparent (opacity = 0.5
). Finally, I want the outline color = "#354f52"
to be the same color as the fill. This will matter more when I change the fill color of the parks later on.
Lines 25-28 define the National Park’s behavior on mouseover. First we have to define and initialize the highlightOptions()
function. The function takes arguments similar to those of the addPolygons
function - both of which I go over in detail in part II.
I want to keep the mouseover behavior noticeable, but simple. To do so, I set the outline’s thickness to be weight = 3
. This will give the shape a nice border that differentiates it from the rest of the map.
color = "#fff"
sets the outline’s color on mouseover only. So, when inactive, the outline color will match the fill color, but on mouseover the outline color switches to white (#fff
).
bringToFront
can either be TRUE
or FALSE
. If TRUE
, Leaflet will bring the park to the forefront on mouseover. This is useful later when we add in the state parks because national and state parks tend to be close together.
When FALSE
the shape will remain static.
Since Leaflet adds all new data to the top of the base map, I think it’s useful to group the layers together. In the next block of code, we add in some layer functionality. For now, though, I want to add the National Parks to their own group so I can hide the National Parks if I want.
- -addLayersControl
defines how layers are displayed on the final map. The function takes three arguments.
First, we have to tell Leaflet which layer should be used as the base map: baseGroups = "Base Map"
. The name in the quotations (here: "Base Map"
) has to match the name given to the layer you set in the addPolygons()
call. In line 14, I put the 50 states into a group called "Base Map"
, but you can name it anything you like.
There can be more than one base map, too. It’s not super helpful here since I shifted Alaska and Hawaii, but when using map tiles you can add multiple types of base maps that users can switch between.
- -Next, we have to define the layers that are shown on top of the base group: overlayGroups = "National Parks"
. Just like the base map, this is defined in the corresponding addPolygons
call. Here, I called the layer National Parks
in line 30.
Finally, on the map I don’t want the layers to be collapsed, so I set options = layersControlOptions(collapsed = FALSE)
. When TRUE
the map will display an icon in the top right that, when clicked, will show the available layers.
Hey, look at that! You made a base map and you added some National Park data to it. You’re a certified cartographer now!
- -In part IV, we’ll download and process the state park data before adding it to the map. In part V of this series, we’ll add Shiny functionality and some additional markers.
- - -</figure>
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/13/bac6e2f146265aecc927e21f50544292fd950718c1028eb350f5d38c3096d1 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/13/bac6e2f146265aecc927e21f50544292fd950718c1028eb350f5d38c3096d1 deleted file mode 100644 index 0223795..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/13/bac6e2f146265aecc927e21f50544292fd950718c1028eb350f5d38c3096d1 +++ /dev/null @@ -1,587 +0,0 @@ -I"äThis is part three of my cartography in R series. If you are just finding this, I suggest taking a look at part I and part II first.
- -In this post, I will download and process the National Park data. Once that’s done, I’ll add it to the base map I created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
- library("tigris") # shift_geometry() used below
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-
## load data
- states <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
to reflect wherever you saved the shifted shapefile.
If your data processing and base map creation are in the same file, you can skip this line, and when you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -The National Park Service provides all the data we’ll need to make the map. The data is accessible on the ArcGIS’ Open Data website. Once you click on the link you’ll see a bunch of icons that lead to different data that’s available for download. Click on the one for boundaries.
- - - -From here, you’ll be taken to a list of available National Park data. The second link should be nps boundary which contains the shape data for all the National Parks in the United States. The file contains all the data for the park outlines along with hiking trails, rest areas, and lots of other data.
- - - -The nps boundary link will take you to a map showing the national parks. On the left, there will be a download link.
- - - -From here, you’ll have a few download options. The National Park Service provides the data in different formats including CSV and Shapefile. You’ll want to download the shapefile version.
- - - -Be sure to save the file somewhere on your hard drive that is easy to find. When it finishes downloading, be sure to unzip the file. There will be four files inside the folder. All of them need to be kept in the same location. Even though we’ll only load the .shp
file, R uses the three others to create the necessary shapes.
The code below may look intimidating, but it’s fairly straightforward. I’ll go over each line below.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- ## load and process nps data
- nps <- read_sf("./shapefiles/original/nps/NPS_-_Land_Resources_Division_Boundary_and_Tract_Data_Service.shp") %>%
- select(STATE, UNIT_TYPE, PARKNAME, Shape__Are, geometry) %>%
- filter(STATE %!in% territories) %>%
- mutate(type = case_when(UNIT_TYPE == "International Historic Site" ~ "International Historic Site", # there's 23 types of national park, I wanted to reduce this number.
- UNIT_TYPE == "National Battlefield Site" ~ "National Military or Battlefield", # lines 56-77 reduce the number of park types
- UNIT_TYPE == "National Military Park" ~ "National Military or Battlefield",
- UNIT_TYPE == "National Battlefield" ~ "National Military or Battlefield",
- UNIT_TYPE == "National Historical Park" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Historic Site" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Historic Trail" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Memorial" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Monument" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Preserve" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National Reserve" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National Recreation Area" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National River" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Lakeshore" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Wild & Scenic River" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Seashore" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Trails Syste" ~ "National Trail",
- UNIT_TYPE == "National Scenic Trail" ~ "National Trail",
- UNIT_TYPE == "National Park" ~ "National Park or Parkway",
- UNIT_TYPE == "Park" ~ "National Park or Parkway",
- UNIT_TYPE == "Parkway" ~ "National Park or Parkway",
- UNIT_TYPE == "Other Designation" ~ "Other National Land Area")) %>%
- mutate(visited = case_when(PARKNAME == "Joshua Tree" ~ "visited",
- PARKNAME == "Redwood" ~ "visited",
- PARKNAME == "Santa Monica Mountains" ~ "visited",
- PARKNAME == "Sequoia" ~ "visited",
- PARKNAME == "Kings Canyon" ~ "visited",
- PARKNAME == "Lewis and Clark" ~ "visited",
- PARKNAME == "Mount Rainier" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted national park data
- st_write(nps, "~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
In part I of this series I talked about how R has an %in%
function, but not a %!in%
function. Here’s where the latter function shines.
The United States is still an empire with its associated territories and islands. In this project I am interested in the 50 states - without these other areas. As a result, I need to filter them out. Using base R’s %in%
function I would have to create a variable that contains the postal abbreviations for all 50 states. That is annoying. Instead, I want to use the shorter list that only includes the US’ associated islands and territories. To do so, however, I need to use the operator tools’ %!in%
function.
Line 2 creates the list of US territories that I filter out in line 7. The c()
function in R means combine or concatenate. Inside the parentheses are the five postal codes for American Samoa, Guam, the Northern Mariana Islands, Puerto Rico, and the Virgin Islands.
nps <- read_sf("path/to/file.shp")
loads the National Park data set to a variable called nps
using the read_sf()
function that is part of the sf package. You will need to change the file path so it reflects where you saved the data on your hard drive.
The %>%
operator is part of the tidyverse package. It takes the result of the current line and passes it to the function on the next line as that function’s first argument. It has to go at the end of a line, rather than the beginning.
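Here is a tiny sketch of what the pipe does. The numbers are made up and have nothing to do with the park data; the only point is that the piped and nested versions give the same answer.

```r
library(magrittr)  # provides %>%; it is also attached by library(tidyverse)

x <- c(4, 9, 16)

# The pipe hands the value on its left to the function on its right
# as that function's first argument.
piped  <- x %>% sqrt() %>% sum()

# The same computation written as nested calls.
nested <- sum(sqrt(x))

piped == nested
```

Reading left to right is the main benefit: each step in the long `nps` chain sees the output of the step before it.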
select
is part of the tidyverse package. With it, we can select columns by their name rather than their associated number. Large data sets take more computing power because the computer has to iterate over more rows. Unfortunately, rendering maps also takes a lot of computing power so I like to discard any unnecessary columns to reduce the amount of effort my computer has to exert.
Deciding on which columns to keep will depend on the data you’re using and what you want to map (or analyze). I know for my project I want to include a few things:
-There are a couple of ways to inspect the data to see what kind of information is available.
- -view(nps)
but as the number of data points increases, so does R’s struggle with opening it. I’ve found that VSCode doesn’t throw as big of a fit as RStudio when opening large data sets.
data.frame(colnames(nps))
. This will return a list of the data set’s column names. This is my preferred method. I then go to the documentation to see what each column contains. This isn’t fool-proof because it really depends on whether the data has good documentation.
The National Park data includes a lot of information about who created the data and maintains the property. I’m not interested in this, so in line 6 I select the following columns:
-The geometry column is specific to shapefiles and it includes the coordinates of the shape. It will be kept automatically - unless you use the st_drop_geometry()
function. I like to select it explicitly so I remember it’s there.
In line 7 I use the territories list I created in line 2 to filter out the United States’ associated areas. Since the nps data uses the two character state abbreviation, I have to use the two character abbreviation for the territories. Searching for “Guam,” for example, won’t work.
- -filter()
is part of the tidyverse and it uses conditional language. In the parentheses is a condition that must be true if the tidyverse is going to keep the row. Starting at the top of the data, R goes “alright, does the value in the STATE column match any of the values in the territories list?” If the condition is TRUE, R adds the row to the new data frame.
%!in%
operator, any row that evaluates as TRUE will be kept because the value is NOT found in the territories list. If I wanted to keep only the territories, I would use the %in%
operator and only the rows with STATE abbreviations found in the territories list would be kept. For example, if the STATE value in row 1 is CA, filter looks at it and goes “is CA NOT IN territories?” If that is TRUE, keep it because we want only the values that are NOT IN the territories list.
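If you would rather not install operator.tools, negating base R’s %in% gives the same result. The sketch below uses a small made-up data frame (the park rows and the name `parks` are hypothetical) rather than the real NPS shapefile.

```r
# Hypothetical sample rows standing in for the STATE column of the NPS data.
parks <- data.frame(
  PARKNAME = c("Joshua Tree", "War in the Pacific", "Mount Rainier", "San Juan"),
  STATE    = c("CA", "GU", "WA", "PR"),
  stringsAsFactors = FALSE
)

territories <- c("AS", "GU", "MP", "PR", "VI")

# %in% is TRUE for rows whose STATE is a territory;
# the leading ! flips that, so TRUE now means "NOT in territories".
keep <- !(parks$STATE %in% territories)

# Keep only the rows where the condition is TRUE.
states_only <- parks[keep, ]
states_only$STATE
```

Inside a `filter()` call the equivalent condition would be `!(STATE %in% territories)`.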
- -mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
The NPS data set has 23 different types of National Parks listed (you can view all of them by running levels(as.factor(nps$UNIT_TYPE))
). I know that in later posts, I’m going to color code the land by type (blue for rivers, green for national parks, etc) so I wanted to reduce the number of colors I would have to use.
mutate()
’s first argument, type =
creates a new column called type
. R will populate the newly created column with whatever comes after the first (singular) equal =
sign. For example, I can put type = NA
and every row in the column will say NA
.
Here, I am using the case_when()
function, which is also part of the tidyverse. The logic of case_when
is fairly straightforward. The first value is the name of the column you want R to look in (here: UNIT_TYPE
). Next, is a conditional. Here I am looking for an exact match (==
) to the string (words) inside the first set of quotation marks (in line 8: "International Historic Site"
). The last part of the argument is what I want R to put in the type
column when it finds a row where the UNIT_TYPE
is "International Historic Site"
.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
Lines 9-29 do the same thing for the other park types. You can reduce the parks however you want or use all 23 types. Just remember that the value before the tilde ~
has to match the values found in the data exactly. For example, in line 24 I change the NPS data’s National Trails Syste value to be National Trail. Whoever created the data set did not spell system correctly, so for R to match the value I also have to omit the last letter in system.
Lines 30-37 use the same mutate()
and case_when
logic as above. Instead of reducing the number of park types, I use it to mark the different parks I have visited.
Line 30 creates the new column, visited
and uses case_when
to look for the names of the parks that I’ve been to. If I have visited them, it adds visited
to the column of the same name.
The last line, TRUE ~ "not visited"))
, acts as an else statement. For any park not listed above, it will put not visited
in the visited
column I created.
This feels like a very brute-force method of tracking which parks I’ve visited, but I haven’t spent much time trying to find another way.
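One possible alternative, offered as a sketch rather than the approach used in this series, is to keep the visited parks in a single vector and test membership with %in%. The data frame `parks` and the vector `visited_parks` below are hypothetical stand-ins; only the park names come from the code above.

```r
library(dplyr)

# Hypothetical stand-in for the nps data; only the PARKNAME column matters here.
parks <- data.frame(
  PARKNAME = c("Joshua Tree", "Acadia", "Redwood", "Mount Rainier"),
  stringsAsFactors = FALSE
)

# One vector to maintain instead of one case_when() line per park.
visited_parks <- c("Joshua Tree", "Redwood", "Santa Monica Mountains",
                   "Sequoia", "Kings Canyon", "Lewis and Clark", "Mount Rainier")

# if_else() returns the second argument where the condition is TRUE
# and the third argument everywhere else.
parks <- parks %>%
  mutate(visited = if_else(PARKNAME %in% visited_parks, "visited", "not visited"))

parks$visited
```

Adding a newly visited park then means appending one name to `visited_parks`, and the vector could just as easily be read from a text file.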
- -In part I, when I made the base map, I moved Alaska and Hawaii so they were of similar size and closer to the continental USA. For the map to display the parks correctly, I have to shift them as well.
- -I went over these two lines in part II, so I won’t go over them again here. If you want to read more about them, check out that post.
- -The last line uses the st_transform()
function from the sf package to convert the data set from NAD83 to WGS84. Leaflet requires WGS84, so be sure to include this line at the end of your data manipulation.
I covered the WGS84 ellipsoid in part I, if you want to read more about it.
- -Strictly speaking, this line isn’t necessary. You can do all your data processing in the same file where you make your map, but I prefer to separate the steps into different files.
- -As a result, I save the shifted data to my hard drive so it’s easier to load later. I usually have this line commented out (by placing #
at the start of the line) after I save it the first time. I don’t want it to save every time I run the rest of the code.
1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-
## create usa Base Map using leaflet()
- map <- leaflet() %>%
- addPolygons(data = states,
- smoothFactor = 0.2,
- fillColor = "#808080",
- fillOpacity = 0.5,
- stroke = TRUE,
- weight = 0.5,
- opacity = 0.5,
- color = "#808080",
- highlight = highlightOptions(
- weight = 0.5,
- color = "#000000",
- fillOpacity = 0.7,
- bringToFront = FALSE),
- group = "Base Map") %>%
- addPolygons(data = nps,
- smoothFactor = 0.2,
- fillColor = "#354f52",
- fillOpacity = 1,
- stroke = TRUE,
- weight = 1,
- opacity = 0.5,
- color = "#354f52",
- highlight = highlightOptions(
- weight = 3,
- color = "#fff",
- fillOpacity = 0.8,
- bringToFront = TRUE),
- group = "National Parks") %>%
- addLayersControl(
- baseGroups = "Base Map",
- overlayGroups = "National Parks",
- options = layersControlOptions(collapsed = FALSE))
-
Lines 2-16 are identical to those in part II where I created the base map. I am not going to cover these lines in detail, because I covered them previously.
- -To add the National Park data to the base map, we call addPolygons()
again. The arguments are the same as before - color, opacity, outline style - just with different values. By changing those values, we can differentiate the base map from the national park data.
Since we’re mapping the National Parks and not the states, we have to tell R where the data is located using data = nps
.
smoothFactor()
determines how detailed the park boundaries should be. The lower the number, the more detailed the shape. The higher the number, the smoother the parks will render. I usually match this to whatever I set for the base map for consistency.
Define the color and transparency of the National Parks. In a future post, I am going to change the color of each type of public land, but for now, I’ll make them all a nice sage green color #354f52
. I also want the parks to be fully opaque.
The next four lines (21-24) define what kind of outline the National Parks will have. I detail each of these arguments in part II of this series.
- -Briefly, I want there to be an outline to each park (stroke = TRUE
) that’s thicker weight = 1
than the outline used on the base map. I do not like the way it looks at full opacity, so I make it half-transparent (opacity = 0.5
). Finally, I want the outline color = "#354f52"
to be the same color as the fill. This will matter more when I change the fill color of the parks later on.
Lines 25-28 define the National Park’s behavior on mouseover. First we have to define and initialize the highlightOptions()
function. The function takes similar arguments to the addPolygons
function - both of which I go over in detail in part II.
I want to keep the mouseover behavior noticeable, but simple. To do so, I set the outline’s thickness to be weight = 3
. This will give the shape a nice border that differentiates it from the rest of the map.
color = "#fff"
sets the outline’s color on mouseover only. So, when inactive, the outline color will match the fill color, but on mouseover the outline color switches to white (#fff
).
bringToFront
can either be TRUE
or FALSE
. If TRUE
, Leaflet will bring the park to the forefront on mouseover. This is useful later when we add in the state parks because national and state parks tend to be close together.
When FALSE
the shape will remain static.
Since Leaflet adds all new data to the top of the base map, I think it’s useful to group the layers together. In the next block of code, we add in some layer functionality. For now, though, I want to add the National Parks to their own group so I can hide the National Parks if I want.
- -addLayersControl
defines how layers are displayed on the final map. The function takes three arguments.
First, we have to tell Leaflet which layer should be used as the base map: baseGroups = "Base Map"
. The name in the quotations (here: "Base Map"
) has to match the name given to the layer you set in the addPolygons()
call. In line 14, I put the 50 states into a group called "Base Map"
, but you can name it anything you like.
There can be more than one base map, too. It’s not super helpful here since I shifted Alaska and Hawaii, but when using map tiles you can add multiple types of base maps that users can switch between.
- -Next, we have to define the layers that are shown on top of the base group: overlayGroups = "National Parks"
. Just like the base map, this is defined in the corresponding addPolygons
call. Here, I called the layer National Parks
in line 30.
Finally, on the map I don’t want the layers to be collapsed, so I set options = layersControlOptions(collapsed = FALSE)
. When TRUE
the map will display an icon in the top right that, when clicked, will show the available layers.
Hey, look at that! You made a base map and you added some National Park data to it. You’re a certified cartographer now!
- -In part IV, we’ll download and process the state park data before adding it to the map. In part V of this series, we’ll add Shiny functionality and some additional markers.
- - -</figure>
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/17/b7cba20a27e3021ce8f6ec95153175849f4d20efc196fd16f500aca543adc6 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/17/b7cba20a27e3021ce8f6ec95153175849f4d20efc196fd16f500aca543adc6 deleted file mode 100644 index ecc58f5..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/17/b7cba20a27e3021ce8f6ec95153175849f4d20efc196fd16f500aca543adc6 +++ /dev/null @@ -1,654 +0,0 @@ -I"MôWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load only the geopackage’s layer names. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
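If loading a whole layer crashes R, one option worth trying is to preview a handful of rows using the query argument of st_read(). The sketch below is a minimal example under assumptions: it uses the North Carolina sample data that ships with sf, written to a temporary geopackage, as a stand-in for the PAD-US file, and the layer name `counties` is made up. For PAD-US you would swap in your own path and one of the layer names returned by st_layers().

```r
library(sf)

# Stand-in data: the North Carolina sample shapefile bundled with sf.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Write it to a temporary geopackage so there is a layer to inspect.
gpkg <- tempfile(fileext = ".gpkg")
st_write(nc, gpkg, layer = "counties", quiet = TRUE)

# List the layers without loading any features.
layers <- st_layers(gpkg)
layers$name

# Read only the first five features of the layer with an SQL query.
preview <- st_read(gpkg, query = "SELECT * FROM counties LIMIT 5", quiet = TRUE)

nrow(preview)      # number of features actually loaded
colnames(preview)  # column names to look up in the documentation
```

Five rows are enough to see the column names and the kind of values each one holds, which is all that is needed before deciding which columns to keep.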
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
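- -Since we’re working with a geopackage (instead of a shapefile), I load the data with st_read()
and name the layer I want. read_sf() would also work here, since both functions can read a geopackage and accept a layer argument.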
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the layer PADUS3_0Fee layer because I know it contains the shape data from the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
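I can only guess at why combining them failed, but one common cause is operator precedence: in R, & binds more tightly than |, so an unparenthesized mix of the two groups the conditions differently than intended. The sketch below uses a made-up data frame (`lands` and its rows are hypothetical; the column names and designation codes come from the code above) to show the pitfall and one way around it with %in%.

```r
library(dplyr)

# Hypothetical rows using the PAD-US column names from the code above.
lands <- data.frame(
  Unit_Nm  = c("A", "B", "C", "D", "E"),
  Own_Type = c("STAT", "FED", "STAT", "STAT", "FED"),
  Des_Tp   = c("SP", "SP", "MIL", "SREC", "SREC"),
  stringsAsFactors = FALSE
)

# Pitfall: this is read as (Own_Type == "STAT" & Des_Tp == "SP") | Des_Tp == "SREC",
# so the federal row "E" slips through.
pitfall <- lands %>%
  filter(Own_Type == "STAT" & Des_Tp == "SP" | Des_Tp == "SREC")

# The designation codes used in the post, kept in one vector.
designations <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# One filter() call: conditions separated by commas are combined with AND,
# and %in% replaces the long chain of | comparisons.
state_lands <- lands %>%
  filter(Own_Type == "STAT",
         Des_Tp %in% designations)

pitfall$Unit_Nm
state_lands$Unit_Nm
```

Wrapping the | chain in its own set of parentheses would fix the grouping too; the %in% version is just shorter to read and maintain.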
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any column I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"
s; for the rest, I just populated the new column with the old values.
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the mean time I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
creates a new column visited
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
- -1
-2
-3
-4
-5
-
split_states <- split(state_parks, f = state_parks$State_Nm) # split the data by state
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
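To see what split() and the loop are doing without touching the real data, here is a small sketch. It assumes a plain data frame in place of the sf object and uses write.csv() as a stand-in for st_write() so it runs without any shapefiles; the rows, the name `parks`, and the output folder are all made up.

```r
# Hypothetical stand-in for state_parks: a plain data frame, not an sf object.
parks <- data.frame(
  Unit_Nm  = c("Valley of Fire State Park", "Salton Sea",
               "Anza-Borrego Desert State Park"),
  State_Nm = c("NV", "CA", "CA"),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per state.
split_states <- split(parks, f = parks$State_Nm)
all_names <- names(split_states)

# Write into a temporary folder so the sketch leaves nothing behind.
out_dir <- file.path(tempdir(), "individual")
dir.create(out_dir, showWarnings = FALSE)

for (name in all_names) {
  # [[name]] pulls out the data frame for one state;
  # write.csv() stands in for st_write() here.
  write.csv(split_states[[name]],
            file.path(out_dir, paste0(name, ".csv")),
            row.names = FALSE)
}

list.files(out_dir)
```

Each pass through the loop handles one state: `name` takes the values in `all_names` one at a time, and the file name is built from it with paste0().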
- -* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me a line if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data that do not rely on loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
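To make the output of st_layers() concrete without touching the full PAD-US download, here is a minimal sketch that writes a tiny, made-up geopackage to a temporary file and then lists its layers. The layer name demo_parks, the two points, and the temporary file are all assumptions for illustration; only st_layers() itself comes from the post.

```r
library(sf)

# Two made-up points standing in for park locations (not PAD-US data)
demo <- st_sf(
  park = c("Park A", "Park B"),
  geometry = st_sfc(st_point(c(-114.5, 36.4)),
                    st_point(c(-124.3, 42.0)),
                    crs = 4326)
)

# Write them to a throwaway geopackage as a single layer
gpkg_path <- tempfile(fileext = ".gpkg")
st_write(demo, gpkg_path, layer = "demo_parks", quiet = TRUE)

# List the layers without loading any features
layers <- st_layers(gpkg_path)
layers$name      # layer names
layers$features  # number of features in each layer
```

With the real geopackage, the same two fields are a quick way to see which layers exist and how big each one is before committing to a full st_read().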
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data from the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
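One likely reason the combined version misbehaves is operator precedence: in R, & binds more tightly than |, so a long chain of | conditions after an & is not grouped the way it reads. A minimal sketch of one way around that, assuming a toy data frame in place of the real Fee layer (the four rows are made up), is to put the designation codes in a vector and test membership with %in%. I use !(x %in% y) here instead of %!in% only so the example needs nothing beyond dplyr.

```r
library(dplyr)

# Toy stand-in for the PAD-US Fee layer (hypothetical rows, not real data)
parks <- data.frame(
  Unit_Nm  = c("A", "B", "C", "D"),
  State_Nm = c("NV", "PR", "CA", "OR"),
  Own_Type = c("STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SP", "SP", "MIL")
)

territories  <- c("AS", "GU", "MP", "PR", "VI")
designations <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Comma-separated conditions inside filter() are combined with AND,
# and %in% replaces the chain of Des_Tp == "..." | Des_Tp == "..." tests
kept <- parks %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% designations)

kept$Unit_Nm
```

Only row A survives: B is in a territory, C is federally owned, and D has a designation that is not on the list. If the | chain is kept instead, wrapping the whole chain in parentheses before joining it with & should give the same grouping.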
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area
s, the rest I just populated the new column with the old values.
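Since several designations map to the same label, the case_when() call can be shortened by grouping the original values into vectors and matching with %in%. This is a sketch under assumptions, not the code used for the map: the four-row data frame is made up, and the vector names historic and preserve are mine.

```r
library(dplyr)

# Toy designations (hypothetical rows using values from the PAD-US d_Des_Tp column)
designations <- data.frame(
  d_Des_Tp = c("State Park", "Historic or Cultural Area",
               "State Wilderness", "Access Area")
)

# Group the original values that share a new label
historic <- c("Historic or Cultural Area", "State Historic or Cultural Area")
preserve <- c("Recreation Management Area", "State Resource Management Area",
              "State Wilderness", "State Recreation Area", "State Conservation Area")

typed <- designations %>%
  mutate(type = case_when(
    d_Des_Tp == "Access Area" ~ "State Trail",
    d_Des_Tp %in% historic    ~ "State Historical Park, Site, Monument, or Memorial",
    d_Des_Tp %in% preserve    ~ "State Preserve, Reserve, or Recreation Area",
    d_Des_Tp == "State Park"  ~ "State Park or Parkway"
  ))

typed$type
```

Any value that matches none of the conditions ends up as NA in the new column, which is a handy way to spot a designation that was missed.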
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
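One possible way to keep track of visited parks without adding a case_when() line per park is to hold the names in a single character vector and flag membership with %in%. This is only a sketch: the two-row data frame and the park called Some Other State Park are made up, and visited_parks is a name I chose for illustration.

```r
library(dplyr)

# Names of parks already visited (taken from the case_when() call in the post)
visited_parks <- c("Valley of Fire State Park",
                   "Crissey Field State Recreation Site",
                   "Salton Sea")

# Toy stand-in for the state park data (hypothetical rows)
parks <- data.frame(
  Unit_Nm = c("Valley of Fire State Park", "Some Other State Park")
)

# One membership test replaces one case_when() line per park
parks <- parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

parks$visited
```

Adding a park then means adding one string to the vector, and the vector could just as easily live in a small text file that is read in before this step.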
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code, it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -The split()
is part of base R. It takes quite a few arguments, most of which are optional.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks
data into its corresponding states, so it is listed first.
The second argument f =
is how you want the data split. f
in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm))
in the terminal, it will return a list of the 50 state abbreviations. That is what we’re telling R to do here.
You can access an individual state using the $
operator. split_states$CA
will return the state park data for California.
names
is also part of base R. It does what it sounds like - it gets the names of an object. Here, I want to get the names of each of the split data sets.
Here’s the actual for loop.
- -The basic logic of a for loop is:
- for(x in y) {
- do something}
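To see split(), names(), and the for loop working together without any spatial data, here is a minimal base R sketch. The three-row data frame is made up, and I write CSV files to a temporary folder instead of shapefiles so it runs without sf; with the real data, st_write() would take the place of write.csv().

```r
# Toy stand-in for the state park data (hypothetical rows)
parks <- data.frame(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  State_Nm = c("CA", "NV", "CA")
)

# Split into a named list with one data frame per state
split_states <- split(parks, f = parks$State_Nm)
all_names <- names(split_states)

# Throwaway output folder so the example leaves nothing behind
out_dir <- file.path(tempdir(), "individual")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# For each state name, pull that state's rows and save them to their own file
for (name in all_names) {
  out_file <- file.path(out_dir, paste0(name, ".csv"))
  write.csv(split_states[[name]], out_file, row.names = FALSE)
}

list.files(out_dir)
```

Inside the loop, name takes each state abbreviation in turn, and split_states[[name]] is the programmatic version of typing split_states$CA by hand.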
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me an email or a tweet if it is.
-** print("Hello World!")
1
-2
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - - - - | - - - -read more -This is part three of my cartography in R series. If you are just finding this, I suggest taking a look at part I and part II first.
- -In this post, I will download and process the National Park data. Once that’s done, I’ll add it to the base map I created in part II.
- - - - - - | - - - -read more -Version control is helpful when you want to track your project’s changes. However, GitHub has one major (yet, understandable) shortcoming: file size. The free version of GitHub will warn you if your file is over 50MB and completely reject your push if the file is over 100MB. This is a huge problem when you’re working with shapefiles (.shp) which contain the geographic coordinates necessary for cartography. “Officially” there are three ways around GitHub’s file size limits, but I have a clear favorite.
- - - - - - | - - - -read more -Building off my post about using Git & GitHub, this post is about using the GitHub website to initialize repos and get URLs from existing repos to clone them.
- - - - - - | - - - -read more -This is part of my tutorial series on using Git and GitHub. In particular, this guide is about using GitHub’s desktop app for creating and managing repos. You can download the app here.
- - - - - | - - - -read more -Arguably one of the best things you can do before starting a PhD is invest time in learning how to properly use version control. With version control, you can track, save, and revert changes to any kind of project. There are several options available, but I’m partial to Git & GitHub. Even if you never touch a piece of code, version control is very helpful.
- - - - - - | - - - -read more -This is a continuation of my previous post where I walked through how to download and modify shape data. I also showed how to shift Alaska and Hawaii so they are closer to the continental USA.
- - - - - | - - - -read more -Twitter is a great resource for engaging with the academic community. For example, I saw this Tweet by PhD Genie asking users to name one positive skill learned during their PhD. I love this question for a number of reasons. First, it helps PhDs reframe their experience so it’s applicable outside of academia - which can help when applying to jobs. Second, it’s really cool to see what skills other people have learned during their program.
- - - - - | - - - -read more --This area will not be a real blog, in the sense that it will probably not have regular updates. -
- - - - - | - - - -read more -GET A CITATION MANAGER.
- - - - - | - - - -read more -Here you’ll find several non-partisan ballot guides. I try to include as much information as possible without directly recreating the official voter guide. The information is sourced from the ballot guide, calmatters, ballotpedia, LA Times, voter’s edge, and Mercury News. I have included links to the campaign website or wherever most of the information came from.
+ +I have done my best to keep my views out of it.
+ +I started this to help my two aunts because they would ask me to simplify their ballots for them. Democracy relies on an informed and participatory citizenry, but it’s not always easy. This is meant to alleviate some of the burden.
+ +If you notice any errors, you feel like I’ve missed something, or you found this guide helpful feel free to send me an email [click the envelope at the bottom of the page]
+ +Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data that do not rely on loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
- filter(Own_Type == "STAT")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data from the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-Here you’ll find several non-partisan ballot guides. I try to include as much information as possible without directly recreating the official voter guide. The information is sourced from the ballot guide, calmatters, ballotpedia, LA Times, voter’s edge, and Mercury News. I have included links to the campaign website or wherever most of the information came from.
+ +I have done my best to keep my views out of it.
+ +I started this to help my two aunts because they would ask me to simplify their ballots for them. Democracy relies on an informed and participatory citizenry, but it’s not always easy. This is meant to alleviate some of the burden.
+ +If you notice any errors, you feel like I’ve missed something, or you found this guide helpful feel free to send me an email [click the envelope at the bottom of the page]
+ +Click on a link to download the PDF.
+:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/1b/345711ba56a1cc2bb0a759c50a6ba2ba9291d2425fc99348b32889455a4850 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/1b/345711ba56a1cc2bb0a759c50a6ba2ba9291d2425fc99348b32889455a4850 deleted file mode 100644 index d65c0f0..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/1b/345711ba56a1cc2bb0a759c50a6ba2ba9291d2425fc99348b32889455a4850 +++ /dev/null @@ -1,686 +0,0 @@ -I"ľţWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
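As a small illustration of what st_layers() hands back, the sketch below writes a two-feature toy geopackage to a temporary file and then lists its layers. The toy data, the layer name toy_parks, and the temporary path are assumptions invented for the example; they are not part of PAD-US.

```r
library(sf)

# Hypothetical toy data: two points standing in for park features
toy <- st_sf(
  park = c("Park A", "Park B"),
  geometry = st_sfc(st_point(c(-114.5, 36.4)),
                    st_point(c(-124.2, 42.0)),
                    crs = 4326)
)

# Write the toy layer to a throwaway geopackage
gpkg_path <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg_path, layer = "toy_parks", quiet = TRUE)

# st_layers() only reads the layer metadata, not the features themselves
toy_layers <- st_layers(gpkg_path)
toy_layers$name      # layer names
toy_layers$features  # feature count per layer
```

The same $name and $features elements should be available on the layers object created from the real geopackage, which makes it easy to copy the exact layer name into st_read().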
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although read_sf() can also read a geopackage layer.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
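One possible reason the combined call misbehaves is operator precedence: in R, & binds more tightly than |, so a chain of & conditions followed by | alternatives is not grouped the way it reads. The sketch below uses a made-up five-row table (not PAD-US data) to show the pitfall and one way around it with %in%. It uses !(x %in% y) instead of %!in% only so the example does not depend on operator.tools.

```r
library(dplyr)

# Hypothetical toy rows standing in for the Fee layer
toy <- tibble(
  State_Nm = c("NV", "PR", "CA", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "FED", "STAT", "STAT"),
  Des_Tp   = c("SP", "SP", "SP", "SREC", "MIL")
)

territories <- c("AS", "GU", "MP", "PR", "VI")
keep_types  <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Pitfall: this parses as (A & B & Des_Tp == "SREC") | Des_Tp == "SP",
# so every "SP" row is kept regardless of state or owner
unparenthesized <- toy %>%
  filter(!(State_Nm %in% territories) &
           Own_Type == "STAT" &
           Des_Tp == "SREC" |
           Des_Tp == "SP")

# One filter() call: comma-separated conditions are combined with AND,
# and %in% replaces the chain of == ... | == ...
combined <- toy %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% keep_types)

nrow(unparenthesized)
nrow(combined)
```

Wrapping the | chain in its own pair of parentheses would also work; %in% is just shorter to read and to extend.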
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
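To answer the “what are the other types?” question quickly, it can help to tabulate the column before filtering. This is a minimal sketch on an invented four-row table; the park names are hypothetical. On the real sf object, dropping the geometry first with sf::st_drop_geometry() is likely to be faster, though I have not timed it.

```r
library(dplyr)

# Hypothetical toy rows, one per access type
toy_access <- tibble(
  Unit_Nm    = c("Park A", "Park B", "Park C", "Park D"),
  d_Pub_Acce = c("Open Access", "Closed", "Restricted Access", "Unknown")
)

# List every access type and how often it occurs
count(toy_access, d_Pub_Acce)

# Same effect as the two != conditions: drop Closed and Unknown
visitable <- toy_access %>%
  filter(!(d_Pub_Acce %in% c("Closed", "Unknown")))

visitable$Unit_Nm
```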
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of dplyr, one of the core tidyverse packages, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for Recreation Management Area
s; for the rest, I just populated the new column with the old values.
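The “change one value, keep the rest” pattern can be sketched with a fallback branch in case_when(). The three-row table and the replacement label below are assumptions for illustration only; the full pipeline above spells out every designation explicitly instead.

```r
library(dplyr)

# Hypothetical toy designations
toy_types <- tibble(
  d_Des_Tp = c("Recreation Management Area", "State Park", "State Wilderness")
)

recoded <- toy_types %>%
  mutate(type = case_when(
    # Rename one designation (the new label is made up for this example)
    d_Des_Tp == "Recreation Management Area" ~ "State Recreation Area",
    # Fallback: every other row keeps its original value
    TRUE ~ d_Des_Tp
  ))

recoded$type
```

Without the TRUE fallback, any designation that matches none of the conditions ends up as NA in the new column.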
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
creates a new column visited
and populated it by using case_when()
.
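One possible way to keep the visited list manageable is to store the park names in a vector and test membership with %in%, so adding a park means adding one string. This is only a sketch on made-up rows; visited_parks and toy_parks are hypothetical names, not part of the original code.

```r
library(dplyr)

# Hypothetical list of visited parks; extend it as photos become available
visited_parks <- c("Valley of Fire State Park",
                   "Anza-Borrego Desert State Park")

# Hypothetical toy rows standing in for state_parks
toy_parks <- tibble(
  Unit_Nm = c("Valley of Fire State Park",
              "Some Other State Park",
              "Anza-Borrego Desert State Park")
)

flagged <- toy_parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks,
                           "visited", "not visited"))

flagged$visited
```

The vector could also be read from a small text or CSV file, which would keep the list out of the processing script entirely.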
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears below the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
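The reprojection step can be checked on a single point. The sketch below assumes EPSG:5070 (an Albers projection for the contiguous US) as a stand-in for whatever projected CRS the shifted data carries, and uses the EPSG code 4326 for WGS84 instead of the proj string.

```r
library(sf)

# Hypothetical point at the origin of an Albers projection (EPSG:5070 is an assumption)
albers_point <- st_sfc(st_point(c(0, 0)), crs = 5070)

# Reproject to WGS84 longitude/latitude, which Leaflet expects
wgs84_point <- st_transform(albers_point, 4326)

st_is_longlat(albers_point)  # FALSE for a projected CRS
st_is_longlat(wgs84_point)   # TRUE after the transform
st_coordinates(wgs84_point)
```

Running st_is_longlat() on the finished state_parks object is a quick way to confirm the data is ready for Leaflet before saving it.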
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code, it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
- -1
-2
-3
-4
-5
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -The split()
function is part of base R. It takes a few arguments, most of which are optional.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks
data into its corresponding states, so it is listed first.
The second argument f =
is how you want the data split. f
in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm))
in the console, it will return a character vector of the 50 state abbreviations. Those are the groups R splits the data into.
You can access an individual state using the $
operator. split_states$CA
will return the state park data for California.
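Here is the same idea on a three-row toy data frame, so the shape of the result is easy to see. The rows are invented for the example; only split(), names(), and the $ and [[ ]] accessors matter.

```r
# Hypothetical toy rows standing in for state_parks
toy_parks <- data.frame(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  State_Nm = c("CA", "NV", "CA"),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per state
toy_split <- split(toy_parks, f = toy_parks$State_Nm)

names(toy_split)    # the state abbreviations used as list names
toy_split$CA        # rows for California
toy_split[["NV"]]   # same access, using a string (handy inside loops)
```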
names
is also part of base R. It does what it sounds like - it gets the names of an object. Here, I want to get the names of each of the split data sets.
Here’s the actual for loop.
- -The for(x in y) {
- do something}
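To see the loop run end to end without touching the real shapefiles, the sketch below splits a toy data frame and writes one file per state into a temporary folder. It uses write.csv() as a stand-in for st_write() so it runs without spatial data; the folder name toy_states and the rows are assumptions for the example.

```r
# Hypothetical toy rows standing in for state_parks
toy_parks <- data.frame(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  State_Nm = c("CA", "NV", "CA"),
  stringsAsFactors = FALSE
)
toy_split <- split(toy_parks, f = toy_parks$State_Nm)

# Throwaway output folder
out_dir <- file.path(tempdir(), "toy_states")
dir.create(out_dir, showWarnings = FALSE)

# For each state name, build a file path and write that state's rows
for (name in names(toy_split)) {
  out_path <- file.path(out_dir, paste0(name, ".csv"))
  write.csv(toy_split[[name]], out_path, row.names = FALSE)
}

list.files(out_dir)
```

On each pass, name holds one state abbreviation, so toy_split[[name]] picks that state’s rows and paste0() builds the matching file name.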
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me an email or a tweet if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although read_sf() can also read a geopackage layer.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
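As a rough sketch of the != approach, here is the same idea on four invented rows, one per access type. The park names are placeholders; only the d_Pub_Acce values come from the description above.

```r
library(dplyr)

# Hypothetical rows, one for each of the four access types
access <- tibble(
  park       = c("Park A", "Park B", "Park C", "Park D"),
  d_Pub_Acce = c("Closed", "Unknown", "Open Access", "Restricted Access")
)

# Spell out what gets dropped, so the excluded values stay visible in the code
visitable <- access %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")

print(visitable)
```

Writing the exclusions explicitly keeps the two remaining access types without having to name them.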
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"
s; for the rest, I just populated the new column with the old values.
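A small sketch of that pattern, with assumptions flagged: the designation values and the replacement label "Recreation Area" are made up for illustration, and the TRUE ~ Des_Tp fallback is one way to carry the old values over, not necessarily the exact code used in this post.

```r
library(dplyr)

# Hypothetical designation values, not the real PAD-US codes
designations <- tibble(
  Des_Tp = c("State Park", "Recreation Management Area", "State Conservation Area")
)

relabelled <- designations %>%
  mutate(type = case_when(
    # change only the one value I care about...
    Des_Tp == "Recreation Management Area" ~ "Recreation Area",
    # ...and copy the original value for everything else
    TRUE ~ Des_Tp
  ))

print(relabelled)
```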
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “for each item in a list, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
- -1
-2
-3
-4
-5
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -The split()
function is part of base R. It only takes a few arguments: the data to split, the grouping factor, and some optional extras.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks
data into its corresponding states, so it is listed first.
The second argument f =
is how you want the data split. f
in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm))
in the console, it will return a list of the 50 state abbreviations. Those are the groups we’re telling R to split the data into.
You can access an individual state using the $
operator. split_states$CA
will return the state park data for California.
names
is also part of base R. It does what it sounds like - it gets the names of an object. Here, I want to get the names of each of the split data sets.
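To see split() and names() in action without loading the real data, here is a small base R sketch. The parks data frame is invented for illustration and has no geometry column.

```r
# Hypothetical stand-in for state_parks
parks <- data.frame(
  State_Nm = c("CA", "NV", "CA", "AK"),
  park     = c("Park A", "Park B", "Park C", "Park D")
)

# One data frame per state, stored together in a named list
split_parks <- split(parks, f = parks$State_Nm)

# The list names are the factor levels, in alphabetical order
all_names <- names(split_parks)
print(all_names)

# Pull out a single state with $
print(split_parks$CA)
```

Each element of split_parks is a regular data frame, so anything you can do to the full data you can do to one state’s slice.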
Here’s the actual for loop.
- -The basic logic of a for loop is:
- for(x in y){
- do something}
Inside the parentheses, x is a placeholder and y is the collection to loop over. The content in the curly braces runs once for each item in y.
- -In line 4, for(name in all_names){
says for each name in the list of all names, do whatever is inside the curly braces. name
can be whatever you want. It’s a placeholder value. I can have it say for(dogs in all_names){
and it will still do the exact same thing, as long as I also use dogs inside the curly braces. A lot of the time you’ll see it as an i
for item. I like to use more descriptive language because, again, for loops are my Achilles’ heel.
The all_names
part is wherever you want R to look for the data. It will change based on your data set and variable naming conventions.
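A quick sketch to show that the placeholder name really doesn’t matter. The three abbreviations and the label text are just example values.

```r
all_names <- c("AK", "CA", "NV")

# Loop using `name` as the placeholder
labels_name <- character(0)
for (name in all_names) {
  labels_name <- c(labels_name, paste0("state: ", name))
}

# Same loop, with `dogs` as the placeholder instead
labels_dogs <- character(0)
for (dogs in all_names) {
  labels_dogs <- c(labels_dogs, paste0("state: ", dogs))
}

# The braces ran once per item in both loops, with identical results
print(identical(labels_name, labels_dogs))
```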
In line 5, I save the split data sets.
- -st_write()
is part of the sf package which allows us to create shapefiles. This can be any saving function (eg. write_csv() if you want to save CSVs). The function takes several arguments. In line 43 above I showed the basic structure: st_write(data, path/to/file.shp). This is good if you only have one file, but since I’m saving them in a loop I don’t want all of the files to have the same name. R will error out after the first and tell you the file already exists.
The first part split_states[[name]]
is still telling R what data to save, but using an index instead of a specific data frame name. To access an index you use data[[some-value]]
some-value
is the index: either a position or, as here, a name. In my code, R will take the split_states
data and go “alright, on the first pass [[name]]
is AK” and return whatever is stored under that name (here, the Alaska parks). It will then do that for every name as it loops through the split_states
data.
paste0()
is also part of base R - it’s equivalent to paste()
with no separator between the pieces, and slightly more efficient. It concatenates (or links together) different pieces into one. I’m using it to create the filename. Within the paste0
call anything within quotation marks is static. So every file will be saved to "shapefiles/shifted/states/individual/"
and every file will have the extension .shp
. What will change with each loop is the name
of the file. One by one, R will loop through and save each file using the name
it pulled from all_names
.
st_write()
automatically creates the other three files that each “shapefile” needs. When the loop is done, you should have a folder of 200 files (50 states * 4 files each), which is why I strongly recommend using DVC if you’re doing any kind of version control.
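Here is the same save-in-a-loop pattern as a self-contained sketch. To keep it runnable without spatial data, I swap st_write() for base R’s write.csv() and write to a temporary folder; the data frame and folder name are made up for illustration.

```r
# Hypothetical stand-in for state_parks (no geometry column)
parks <- data.frame(
  State_Nm = c("CA", "NV", "CA"),
  park     = c("Park A", "Park B", "Park C")
)
split_parks <- split(parks, f = parks$State_Nm)

# Temporary output folder so the sketch doesn't touch real project files
out_dir <- file.path(tempdir(), "individual")
dir.create(out_dir, showWarnings = FALSE)

for (name in names(split_parks)) {
  # static folder + changing state name + static extension
  out_path <- paste0(out_dir, "/", name, ".csv")
  # [[name]] pulls out the data frame stored under that state's name
  write.csv(split_parks[[name]], out_path, row.names = FALSE)
}

saved_files <- sort(list.files(out_dir))
print(saved_files)
```

Swapping write.csv() back for st_write() and ".csv" for ".shp" gives the shape of the loop above, with one file set per state.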
That’s all the processing done for the state files… for now. In part VI I’ll return to the states to create each state’s own map. Next up, in part V, I’m going back to my base map with the National Parks to add in some informational tool tips and interactivity.
- -* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me an email or a tweet if it is.
-** print("Hello World!")
This is part three of my cartography in R series. If you are just finding this, I suggest taking a look at part I and part II first.
- -In this post, I will download and process the National Park data. Once that’s done, I’ll add it to the base map I created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-
## load data
- states <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
to reflect wherever you saved the shifted shapefile.
If your data processing and base map creation are in the same file, you can skip this line, and when you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -The National Park Service provides all the data we’ll need to make the map. The data is accessible on ArcGIS’ Open Data website. Once you click on the link you’ll see a bunch of icons that lead to different data that’s available for download. Click on the one for boundaries.
- - - -From here, you’ll be taken to a list of available National Park data. The second link should be nps boundary which contains the shape data for all the National Parks in the United States. The file contains all the data for the park outlines along with hiking trails, rest areas, and lots of other data.
- - - -The nps boundary link will take you to a map showing the national parks. On the left, there will be a download link.
- - - -From here, you’ll have a few download options. The National Park Service provides the data in different formats including CSV and Shapefile. You’ll want to download the shapefile version.
- - - -Be sure to save the file somewhere on your hard drive that is easy to find. When it finishes downloading, be sure to unzip the file. There will be four files inside the folder. All of them need to be kept in the same location. Even though we’ll only load the .shp
file, R uses the three others to create the necessary shapes.
The code below may look intimidating, but it’s fairly straightforward. I’ll go over each line below.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- ## load and process nps data
- nps <- read_sf("./shapefiles/original/nps/NPS_-_Land_Resources_Division_Boundary_and_Tract_Data_Service.shp") %>%
- select(STATE, UNIT_TYPE, PARKNAME, Shape__Are, geometry) %>%
- filter(STATE %!in% territories) %>%
- mutate(type = case_when(UNIT_TYPE == "International Historic Site" ~ "International Historic Site", # there's 23 types of national park, I wanted to reduce this number.
- UNIT_TYPE == "National Battlefield Site" ~ "National Military or Battlefield", # lines 56-77 reduce the number of park types
- UNIT_TYPE == "National Military Park" ~ "National Military or Battlefield",
- UNIT_TYPE == "National Battlefield" ~ "National Military or Battlefield",
- UNIT_TYPE == "National Historical Park" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Historic Site" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Historic Trail" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Memorial" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Monument" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Preserve" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National Reserve" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National Recreation Area" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National River" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Lakeshore" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Wild & Scenic River" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Seashore" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Trails Syste" ~ "National Trail",
- UNIT_TYPE == "National Scenic Trail" ~ "National Trail",
- UNIT_TYPE == "National Park" ~ "National Park or Parkway",
- UNIT_TYPE == "Park" ~ "National Park or Parkway",
- UNIT_TYPE == "Parkway" ~ "National Park or Parkway",
- UNIT_TYPE == "Other Designation" ~ "Other National Land Area")) %>%
- mutate(visited = case_when(PARKNAME == "Joshua Tree" ~ "visited",
- PARKNAME == "Redwood" ~ "visited",
- PARKNAME == "Santa Monica Mountains" ~ "visited",
- PARKNAME == "Sequoia" ~ "visited",
- PARKNAME == "Kings Canyon" ~ "visited",
- PARKNAME == "Lewis and Clark" ~ "visited",
- PARKNAME == "Mount Rainier" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted national park data
- st_write(nps, "~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
In part I of this series I talked about how R has an %in%
function, but not a %!in%
function. Here’s where the latter function shines.
The United States is still an empire with its associated territories and islands. In this project I am interested in the 50 states - without these other areas. As a result, I need to filter them out. Using base R’s %in%
function I would have to create a variable that contains the postal abbreviations for all 50 states. That is annoying. Instead, I want to use the shorter list that only includes the US’ associated islands and territories. To do so, however, I need to use the operator.tools package’s %!in%
function.
Line 2 creates the list of US territories that I filter out in line 7. The c()
function in R means combine or concatenate. Inside the parentheses are the five postal codes for American Samoa, Guam, the Northern Mariana Islands, Puerto Rico, and the Virgin Islands.
nps <- read_sf("path/to/file.shp")
loads the National Park data set to a variable called nps
using the read_sf()
function that is part of the sf package. You will need to change the file path so it reflects where you saved the data on your hard drive.
The %>%
operator comes from the magrittr package, which the tidyverse loads for you. It takes the result of the code on its left and passes it into the function that follows, so each step feeds the next. It has to go at the end of a line, rather than the beginning.
select
is part of the tidyverse package. With it, we can select columns by their name rather than their associated number. Large data sets take more computing power because the computer has to iterate over more rows. Unfortunately, rendering maps also takes a lot of computing power so I like to discard any unnecessary columns to reduce the amount of effort my computer has to exert.
Deciding on which columns to keep will depend on the data you’re using and what you want to map (or analyze). I know for my project I want to include a few things:
-There are a couple of ways to inspect the data to see what kind of information is available.
- -view(nps)
opens the whole data set, but as the number of data points increases, so does R’s struggle with opening it. I’ve found that VSCode doesn’t throw as big of a fit as RStudio when opening large data sets.
data.frame(colnames(nps))
. This will return a list of the data set’s column names. This is my preferred method. I then go to the documentation to see what each column contains. This isn’t fool-proof because it really depends on whether the data has good documentation.
The National Park data includes a lot of information about who created the data and maintains the property. I’m not interested in this, so in line 6 I select the following columns:
-The geometry column is specific to shapefiles and it includes the coordinates of the shape. It will be kept automatically - unless you use the st_drop_geometry()
function. I like to select it explicitly so I remember it’s there.
In line 7 I use the territories list I created in line 2 to filter out the United States’ associated areas. Since the nps data uses the two character state abbreviation, I have to use the two character abbreviation for the territories. Searching for “Guam,” for example, won’t work.
- -filter()
is part of the tidyverse and it uses conditional language. In the parentheses is a condition that must be true if the tidyverse is going to keep the row. Starting at the top of the data, R goes “alright, is the value in the STATE column NOT in the territories list?” If the condition is TRUE, R adds the row to the new data frame. Because I’m using the
%!in%
operator, any row that evaluates as TRUE will be kept because the value is NOT found in the territories list. If I wanted to keep only the territories, I would use the %in%
operator and only the rows with STATE abbreviations found in the territories list would be kept. For example, if the STATE value in row 1 is CA, filter looks at it and goes “is CA NOT IN territories?” If that is TRUE, keep it because we want only the values that are NOT IN the territories list.
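To make the in / not-in difference concrete, here is a base R sketch. Instead of loading operator.tools, I build a stand-in operator with Negate(); the name %notin% is my own placeholder and the state codes are example values.

```r
territories <- c("AS", "GU", "MP", "PR", "VI")
state_codes <- c("CA", "GU", "NV", "PR")

# Stand-in for a not-in operator: flips the TRUE/FALSE that %in% returns
`%notin%` <- Negate(`%in%`)

# TRUE where the code is NOT a territory, so these are the rows to keep
keep <- state_codes %notin% territories
kept_states <- state_codes[keep]

# The opposite question: which codes ARE territories?
kept_territories <- state_codes[state_codes %in% territories]

print(kept_states)
print(kept_territories)
```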
- -mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
The NPS data set has 23 different types of National Parks listed (you can view all of them by running levels(as.factor(nps$UNIT_TYPE))
). I know that in later posts, I’m going to color code the land by type (blue for rivers, green for national parks, etc) so I wanted to reduce the number of colors I would have to use.
mutate()
’s first argument, type =
creates a new column called type
. R will populate the newly created column with whatever comes after the first (singular) equal =
sign. For example, I can put type = NA
and every row in the column will say NA
.
Here, I am using the case_when()
function, which is also part of the tidyverse. The logic of case_when
is fairly straightforward. The first value is the name of the column you want R to look in (here: UNIT_TYPE
). Next, is a conditional. Here I am looking for an exact match (==
) to the string (words) inside the first set of quotation marks (in line 8: "International Historic Site"
). The last part of the argument is what I want R to put in the type
column when it finds a row where the UNIT_TYPE
is "International Historic Site"
.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
Lines 9-29 do the same thing for the other park types. You can reduce the parks however you want or use all 23 types. Just remember that the value before the tilde ~
has to match the values found in the data exactly. For example, in line 24 I change the NPS data’s National Trails Syste value to be National Trail. Whoever created the data set did not spell system correctly, so for R to match the value I also have to omit the last letter in system.
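A short sketch of why the exact match matters, on a made-up four-row table. The "Made-Up Type" value is invented to show what happens when nothing matches and there is no fallback.

```r
library(dplyr)

# Hypothetical subset of unit types plus one value that matches nothing
units <- tibble(
  UNIT_TYPE = c("National River", "National Seashore", "National Park", "Made-Up Type")
)

units <- units %>%
  mutate(type = case_when(
    UNIT_TYPE == "National River"    ~ "National River, Lakeshore, or Seashore",
    UNIT_TYPE == "National Seashore" ~ "National River, Lakeshore, or Seashore",
    UNIT_TYPE == "National Park"     ~ "National Park or Parkway"
  ))

# Rows with no matching condition are left as NA in the new column
print(units)
```

Checking for NA in the new column is a quick way to catch values that were misspelled in either the data or the code.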
Lines 30-37 use the same mutate()
and case_when
logic as above. Instead of reducing the number of park types, I use it to mark the different parks I have visited.
Line 30 creates the new column, visited
and uses case_when
to look for the names of the parks that I’ve been to. If I have visited them, it adds visited
to the column of the same name.
The last line, TRUE ~ "not visited"))
, acts as an else statement. For any park not listed above, it will put not visited
in the visited
column I created.
This feels like a very brute-force method of tracking which parks I’ve visited, but I haven’t spent much time trying to find another way.
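One possible alternative, offered only as a sketch and not what the code above does: keep the visited parks in a single vector and check membership with %in%. The parks table is made up, and "Yosemite" is there just to have a park that is not in the visited list.

```r
library(dplyr)

# Keep the visited parks in one place, so adding a park is a one-word change
visited_parks <- c("Joshua Tree", "Redwood", "Sequoia")

# Hypothetical stand-in for the nps data
parks <- tibble(PARKNAME = c("Joshua Tree", "Yosemite", "Sequoia"))

parks <- parks %>%
  mutate(visited = if_else(PARKNAME %in% visited_parks, "visited", "not visited"))

print(parks)
```

The trade-off is the same as before: the names in the vector still have to match the data exactly.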
- -In part I, when I made the base map, I moved Alaska and Hawaii so they were of similar size and closer to the continental USA. For the map to display the parks correctly, I have to shift them as well.
- -I went over these two lines in part II, so I won’t go over them again here. If you want to read more about them, check out that post.
- -The last line uses the st_transform()
function from the sf package to convert the data set from NAD83 to WGS84. Leaflet requires WGS84, so be sure to include this line at the end of your data manipulation.
I covered the WGS84 ellipsoid in part I, if you want to read more about it.
- -Strictly speaking, this line isn’t necessary. You can do all your data processing in the same file where you make your map, but I prefer to separate the steps into different files.
- -As a result, I save the shifted data to my hard drive so it’s easier to load later. I usually have this line commented out (by placing #
at the start of the line) after I save it the first time. I don’t want it to save every time I run the rest of the code.
1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-
## create usa Base Map using leaflet()
- map <- leaflet() %>%
- addPolygons(data = states,
- smoothFactor = 0.2,
- fillColor = "#808080",
- fillOpacity = 0.5,
- stroke = TRUE,
- weight = 0.5,
- opacity = 0.5,
- color = "#808080",
- highlight = highlightOptions(
- weight = 0.5,
- color = "#000000",
- fillOpacity = 0.7,
- bringToFront = FALSE),
- group = "Base Map") %>%
- addPolygons(data = nps,
- smoothFactor = 0.2,
- fillColor = "#354f52",
- fillOpacity = 1,
- stroke = TRUE,
- weight = 1,
- opacity = 0.5,
- color = "#354f52",
- highlight = highlightOptions(
- weight = 3,
- color = "#fff",
- fillOpacity = 0.8,
- bringToFront = TRUE),
- group = "National Parks") %>%
- addLayersControl(
- baseGroups = "Base Map",
- overlayGroups = "National Parks",
- options = layersControlOptions(collapsed = FALSE))
-
Lines 2-16 are identical to those in part II where I created the base map. I am not going to cover these lines in detail, because I covered them previously.
- -To add the National Park data to the base map, we call addPolygons()
again. The arguments are the same as before - color, opacity, outline style - just with different values. By changing those values, we can differentiate the base map from the national park data.
Since we’re mapping the National Parks and not the states, we have to tell R where the data is located using data = nps
.
smoothFactor
determines how detailed the park boundaries should be. The lower the number, the more detailed the shape. The higher the number, the smoother the parks will render. I usually match this to whatever I set for the base map for consistency.
Define the color and transparency of the National Parks. In a future post, I am going to change the color of each type of public land, but for now, I’ll make them all a nice sage green color #354f52
. I also want the parks to be fully opaque.
The next four lines (21-24) define what kind of outline the National Parks will have. I detail each of these arguments in part II of this series.
- -Briefly, I want there to be an outline to each park (stroke = TRUE
) that’s thicker (weight = 1
) than the outline used on the base map. I do not like the way it looks at full opacity, so I make it half-transparent (opacity = 0.5
). Finally, I want the outline color = "#354f52"
to be the same color as the fill. This will matter more when I change the fill color of the parks later on.
Lines 25-28 define the National Park’s behavior on mouseover. First we have to define and initialize the highlightOptions()
function. The function takes similar arguments to the addPolygons
function - both of which I go over in detail in part II.
I want to keep the mouseover behavior noticeable, but simple. To do so, I set the outline’s thickness to be weight = 3
. This will give the shape a nice border that differentiates it from the rest of the map.
color = "#fff"
sets the outline’s color on mouseover only. So, when inactive, the outline color will match the fill color, but on mouseover the outline color switches to white (#fff
).
bringToFront
can either be TRUE
or FALSE
. If TRUE
, Leaflet will bring the park to the forefront on mouseover. This is useful later when we add in the state parks because national and state parks tend to be close together.
When FALSE
the shape will remain static.
Since Leaflet adds all new data to the top of the base map, I think it’s useful to group the layers together. In the next block of code, we add in some layer functionality. For now, though, I want to add the National Parks to their own group so I can hide the National Parks if I want.
- -addLayersControl
defines how layers are displayed on the final map. The function takes three arguments.
First, we have to tell Leaflet which layer should be used as the base map: baseGroups = "Base Map"
. The name in the quotations (here: "Base Map"
) has to match the name given to the layer you set in the addPolygons()
call. In line 16, I put the 50 states into a group called "Base Map"
, but you can name it anything you like.
There can be more than one base map, too. It’s not super helpful here since I shifted Alaska and Hawaii, but when using map tiles you can add multiple types of base maps that users can switch between.
- -Next, we have to define the layers that are shown on top of the base group: overlayGroups = "National Parks"
. Just like the base map, this is defined in the corresponding addPolygons
call. Here, I called the layer National Parks
in line 30.
Finally, on the map I don’t want the layers to be collapsed, so I set options = layersControlOptions(collapsed = FALSE)
. When TRUE
the map will display an icon in the top right that, when clicked, will show the available layers.
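To try the layer control on its own, here is a minimal sketch that needs only the leaflet package. It uses a tile layer and a single marker in place of the shapefiles; the "Parks" group name and the coordinates are placeholders for illustration.

```r
library(leaflet)

map <- leaflet() %>%
  # base layer: the group name here must match baseGroups below
  addTiles(group = "Base Map") %>%
  # overlay layer: the group name here must match overlayGroups below
  addCircleMarkers(lng = -114.51, lat = 36.44, group = "Parks") %>%
  addLayersControl(
    baseGroups    = "Base Map",
    overlayGroups = "Parks",
    options       = layersControlOptions(collapsed = FALSE))

# Print `map` in an interactive session to view it
```

If a group name in addLayersControl() doesn’t match the one given to the layer, the checkbox will show up but won’t control anything.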
Hey, look at that! You made a base map and you added some National Park data to it. You’re a certified cartographer now!
- -In part IV, we’ll download and process the state park data before adding it to the map. In part V of this series, we’ll add Shiny functionality and some additional markers.
- - -</figure>
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/1d/f02237ce189d314dfa286e0f263a1e488e15a42f4f48134ec26601492652fe b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/1d/f02237ce189d314dfa286e0f263a1e488e15a42f4f48134ec26601492652fe new file mode 100644 index 0000000..dd9c293 --- /dev/null +++ b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/1d/f02237ce189d314dfa286e0f263a1e488e15a42f4f48134ec26601492652fe @@ -0,0 +1,30 @@ +I"ŮHere you’ll find several non-partisan ballot guides. I try to include as much information as possible without directly recreating the official voter guide. The information is sourced from the ballot guide, calmatters, ballotpedia, LA Times, voter’s edge, and Mercury News. I have included links to the campaign website or wherever most of the information came from.
+ +I have done my best to keep my views out of it.
+ +I started this to help my two aunts because they would ask me to simplify their ballots for them. Democracy relies on an informed and participatory citizenry, but it’s not always easy. This is meant to alleviate some of the burden.
+ +If you notice any errors, feel like I’ve missed something, or found this guide helpful, feel free to send me an email [click the envelope at the bottom of the page].
+ +Click on a link to download the PDF.
+ +It made a difference to that one. + ++ +
I have a simple motto in life: Do what you can, where you are, with what you have. As a result, I believe strongly in doing whatever is in my means to make the world a better place. Below are some groups that I have either founded or joined in order to help those around me.</p>
+ +Political Science Methodology Group | Co-organizer with Melina Much
+University of California, Irvine
Political Science Womxn’s Caucus | Student leader
+University of California, Irvine
Political Science Workshop Coordinator
+University of California, Irvine
Legal Politics Writing Workshop
+University of California, Irvine
Center for Democracy: Writing Workshop | Member
+University of California, Irvine
UCI Humanities: Writing Workshop | Member
+University of California, Irvine
Friends of the San Dimas Dog Park | Ambassador
+San Dimas, California
Prisoner Education Project | Volunteer
+Pomona, California
Tails of the City | Volunteer Photographer
+Los Angeles, California
Philosophy Club | President, Graphic Designer, and Banquet Chair
+California State Polytechnic University, Pomona
Long Distance Voter | Intern
+Social Media Content Creator
Free Press | Intern
+Social Media Content Creator
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load only the geopackage’s layer names. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
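To show what listing layers looks like without the full PAD-US download, here is a minimal sketch. The point layer, its name (toy_points), and the temporary file are made up for illustration and are not part of the PAD-US data.

```r
library(sf)

# Build a tiny, made-up point layer (not PAD-US data)
toy <- st_sf(
  name = c("a", "b"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326)
)

# Write it to a temporary geopackage so the example is self-contained
gpkg_path <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg_path, layer = "toy_points", quiet = TRUE)

# List the layers; this reads the layer metadata, not the features themselves
layers <- st_layers(gpkg_path)
layers$name      # layer names
layers$geomtype  # geometry type of each layer
```

With the real geopackage, the same call only needs the path to the .gpkg file.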
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land). My interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- -The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation. Having visited the park, I can tell you there are no fences blocking these areas off. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area. It will be a good test case to make sure I’m selecting the correct data.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/27/12aa8f79bd9e22edc713eea4ef92bd6a4a4a31cbbca0bd31ab37fb4c291d03 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/27/12aa8f79bd9e22edc713eea4ef92bd6a4a4a31cbbca0bd31ab37fb4c291d03 new file mode 100644 index 0000000..8c88e6e --- /dev/null +++ b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/27/12aa8f79bd9e22edc713eea4ef92bd6a4a4a31cbbca0bd31ab37fb4c291d03 @@ -0,0 +1,3 @@ +I"…Welcome to part five of my cartography in R series. In this post I’ll return to the maps created in part II and part III to include a Shiny information box and popups linking to posts about my adventures in the National Parks.
+ +:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/29/9ccf70e786b1023deae25b915c0bbde5d7f79a5973d22363bc86205b4ca3cb b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/29/9ccf70e786b1023deae25b915c0bbde5d7f79a5973d22363bc86205b4ca3cb deleted file mode 100644 index b77547b..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/29/9ccf70e786b1023deae25b915c0bbde5d7f79a5973d22363bc86205b4ca3cb +++ /dev/null @@ -1,434 +0,0 @@ -I"šWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
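- -Since we’re working with a geopackage (instead of a shapefile) I load the data using st_read()
instead of read_sf(). Either function can open a geopackage; read_sf() is a wrapper around st_read() that reads quietly and returns a tibble.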
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway"))
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
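- -Since we’re working with a geopackage (instead of a shapefile) I load the data using st_read()
instead of read_sf(). Either function can open a geopackage; read_sf() is a wrapper around st_read() that reads quietly and returns a tibble.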
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
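Because loading a whole layer can overwhelm R, one option I did not use in this post is to let st_read() pull only some rows through its query argument. The sketch below uses a made-up geopackage and layer name (toy_points), so treat it as an illustration of the idea rather than the code behind the map.

```r
library(sf)

# Made-up three-point layer standing in for a much larger geopackage layer
toy <- st_sf(
  name = c("a", "b", "c"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(2, 2)), crs = 4326)
)
gpkg_path <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg_path, layer = "toy_points", quiet = TRUE)

# Read the full layer by name, as in the post
all_rows <- st_read(gpkg_path, layer = "toy_points", quiet = TRUE)

# Read only the rows that match an SQL condition, so less data reaches R
one_row <- st_read(
  gpkg_path,
  query = "SELECT * FROM toy_points WHERE name = 'a'",
  quiet = TRUE
)
```

The same pattern should apply to the PAD-US file by swapping in its path, layer name, and a condition on a column such as Own_Type, though I have not timed it on the full data set.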
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
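One possible reason the combined call misbehaves (an assumption on my part, since the failed attempt isn't shown here) is operator precedence: R evaluates & before |, so a chain of | conditions added after an & is not grouped the way it reads. The toy table below is invented to show the difference, with %in% as one way around it.

```r
library(dplyr)

# Invented rows; only the column names come from PAD-US
toy <- tibble(
  Own_Type = c("STAT", "FED", "STAT", "FED"),
  Des_Tp   = c("SP", "SP", "MIL", "REC")
)

# Without parentheses, & binds tighter than |, so the FED / REC row slips through
loose <- toy %>%
  filter(Own_Type == "STAT" & Des_Tp == "SP" | Des_Tp == "REC")

# %in% keeps the designation test as a single condition that combines cleanly with &
keep_types <- c("SP", "REC")
strict <- toy %>%
  filter(Own_Type == "STAT" & Des_Tp %in% keep_types)

nrow(loose)   # rows kept by the unparenthesized version
nrow(strict)  # rows kept by the %in% version
```

Wrapping the whole | chain in parentheses would have the same effect as %in%.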
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
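To check that excluding the two access types I don't want gives the same result as keeping the two I do, here is a small sketch on an invented table. The Unit_Nm values are placeholders; only the four access labels come from the data.

```r
library(dplyr)

# One invented row per access type
toy <- tibble(
  Unit_Nm    = c("A", "B", "C", "D"),
  d_Pub_Acce = c("Open Access", "Restricted Access", "Closed", "Unknown")
)

# The form used in the post: drop what I cannot visit
by_exclusion <- toy %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")

# The positive form: keep what I can visit
by_inclusion <- toy %>%
  filter(d_Pub_Acce %in% c("Open Access", "Restricted Access"))

by_exclusion$Unit_Nm
by_inclusion$Unit_Nm
```

One caveat: filter() drops rows where the condition is NA, so a missing d_Pub_Acce value would be removed by either version.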
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of dplyr, one of the tidyverse packages, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for the Recreation Management Areas; for the rest, I just populated the new column with the old values.
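As a compact illustration of select() followed by mutate() and case_when(), here is a sketch on an invented three-row table. The park names and the extra column are placeholders, and the TRUE fallback is my own shorthand for "keep the old value"; it is not taken from the code above.

```r
library(dplyr)

# Invented rows; the d_Des_Tp labels mirror the ones used in the post
toy <- tibble(
  Unit_Nm  = c("Park One", "Area Two", "Site Three"),
  d_Des_Tp = c("State Park", "Recreation Management Area", "Historic or Cultural Area"),
  extra    = 1:3
)

typed <- toy %>%
  # keep columns by name; `extra` is dropped
  select(Unit_Nm, d_Des_Tp) %>%
  # build the new `type` column from the designation
  mutate(type = case_when(
    d_Des_Tp == "State Park" ~ "State Park or Parkway",
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    TRUE ~ d_Des_Tp  # any designation not listed keeps its original value
  ))
```

Without the fallback line, any designation that matches none of the conditions would come out as NA in the new column.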
Here you’ll find several non-partisan ballot guides. I try to include as much information as possible without directly recreating the official voter guide. The information is sourced from the ballot guide, calmatters, ballotpedia, LA Times, voter’s edge, and Mercury News. I have included links to the campaign website or wherever most of the information came from.
+ +I have done my best to keep my views out of it.
+ +I started this to help my two aunts because they would ask me to simplify their ballots for them. Democracy relies on an informed and participatory citizenry, but it’s not always easy. This is meant to alleviate some of the burden.
+ +If you notice any errors, feel like I’ve missed something, or found this guide helpful, feel free to send me an email [click the envelope at the bottom of the page].
+ +Click on a link to download the PDF.
+Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading it all into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
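As a rough illustration of what st_layers() returns, the sketch below writes a tiny throwaway geopackage to a temporary file and lists its layers. The layer name demo_points and the two sample points are made up for the example; they are not part of PAD-US.

```r
library(sf)

# two made-up points, only here so there is something to write to disk
pts <- st_sf(
  name = c("a", "b"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326)
)

# write them to a temporary geopackage as a single layer
gpkg <- tempfile(fileext = ".gpkg")
st_write(pts, gpkg, layer = "demo_points", quiet = TRUE)

# list the layers without loading the features themselves
layers <- st_layers(gpkg)
layers$name      # layer names
layers$features  # number of features in each layer
```

With the real PAD-US file, the same two accessors should make it easy to check layer names against the documentation before loading anything heavy.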
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access" &
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the layer PADUS3_0Fee layer because I know it contains the shape data from the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
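A minimal sketch of the same two-condition filter on a made-up four-row table, so the logic can be checked without loading PAD-US. The rows are invented for illustration, and base R's !(x %in% y) stands in for %!in% so the example only needs dplyr.

```r
library(dplyr)

territories <- c("AS", "GU", "MP", "PR", "VI")

# invented rows: one territory, one federally owned unit, two state-owned units
parks <- tibble(
  State_Nm = c("NV", "PR", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "FED", "STAT")
)

# keep a row only when BOTH conditions are true
state_owned <- parks %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT")

state_owned
```

Only the Nevada and California rows survive: Puerto Rico fails the first condition and the Oregon row fails the second.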
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
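One possible way to fold the designation filter into the ownership filter is to test membership with %in% instead of chaining == with |. This is only a sketch on invented rows, not a tested replacement for the pipeline above; keep_types is a hypothetical helper name.

```r
library(dplyr)

# designation codes to keep, taken from the filter above
keep_types <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# invented rows for illustration
parks <- tibble(
  Unit_Nm  = c("Park A", "Forest B", "Park C"),
  Own_Type = c("STAT", "STAT", "FED"),
  Des_Tp   = c("SP", "SDA", "SP")
)

# conditions separated by commas inside filter() are combined with AND
kept <- parks %>%
  filter(Own_Type == "STAT", Des_Tp %in% keep_types)

kept
```

Keeping the codes in one vector also makes it easier to add or drop a designation later without touching the filter itself.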
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any column I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area
s, the rest I just populated the new column with the old values.
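A small sketch of that pattern, assuming a made-up two-row table: one value is recoded and every other row falls through to a TRUE ~ branch that copies the old value into the new column.

```r
library(dplyr)

# invented rows for illustration
parks <- tibble(
  d_Des_Tp = c("Recreation Management Area", "State Park")
)

parks <- parks %>%
  mutate(type = case_when(
    # recode the one value that needs to change
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    # everything else keeps its original value
    TRUE ~ d_Des_Tp
  ))

parks
```

Without the TRUE ~ branch, rows that match no condition would get NA in the new column.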
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
creates a new column visited
and populated it by using case_when()
.
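One possible alternative to listing each park in its own case_when() line is to keep the visited park names in a single vector and test membership. This is a sketch on invented rows; visited_parks is a hypothetical name, and the vector could just as well be read from a small text file.

```r
library(dplyr)

# names of parks already visited, kept in one place
visited_parks <- c("Valley of Fire State Park", "Salton Sea")

# invented rows for illustration
parks <- tibble(
  Unit_Nm = c("Valley of Fire State Park", "Salton Sea", "Some Other State Park")
)

parks <- parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

parks
```

Adding a newly visited park then means appending one name to the vector rather than editing the pipeline.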
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
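To see the coordinate system change in isolation, the sketch below transforms a single point from an Albers projection to WGS84. EPSG:5070 (NAD83 / Conus Albers) is an assumption chosen for the example; the shifted data may use a different Albers definition.

```r
library(sf)

# a single point in projected (Albers) coordinates, measured in meters
pt_albers <- st_sfc(st_point(c(0, 0)), crs = 5070)

# convert to longitude/latitude on WGS84, which Leaflet expects
pt_wgs84 <- st_transform(pt_albers, 4326)

st_is_longlat(pt_albers)  # projected, so not longitude/latitude
st_is_longlat(pt_wgs84)   # longitude/latitude after the transform
```

Running st_is_longlat() on the real data after the transform is a quick way to confirm the layer is ready for Leaflet.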
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
This is a continuation of my previous post where I walked through how to download and modify shape data. I also showed how to shift Alaska and Hawaii so they are closer to the continental usa. -
- -In this post, I’ll go over how to use Leaflet to map the shapefile we made in the previous post. If you’ve come here from part one of the series, you probably have the libraries and data loaded already. However, if you don’t, be sure to load the libraries and shapefiles before moving to number two.
- -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -II. cartography in r part two [this post]
-III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-
## load data
- states <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
to reflect wherever you saved the shifted shapefile.
If your data processing and base map creation are in the same file, you can skip this line, and when you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -At its most basic, all Leaflet needs to create a map is a base map and data layers. The code below may look intimidating, but it’s mostly style options.
- -This is the map we’re going to create. It’s a simple grey map and each state darkens in color as you hover over it. I’ll show the same map after each style option is added so you can see what effect it has.
- - - -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-
## create usa base map using leaflet()
- map <- leaflet() %>%
- addPolygons(data = states,
- smoothFactor = 0.2,
- fillColor = "#808080",
- fillOpacity = 0.5,
- stroke = TRUE,
- weight = 0.5,
- opacity = 0.5,
- color = "#808080",
- highlight = highlightOptions(
- weight = 0.5,
- color = "#000000",
- fillOpacity = 0.7,
- bringToFront = FALSE),
- group = "Base Map")
-
leaflet()
initializes the map widget. I save it to a variable called map (map <-
) so I can run other code in the file without recreating the map each time. When you want to see the map, you can type map
(or whatever you want to name your map) in the terminal and hit enter. R will display the map in the viewer.
addPolygons()
adds a layer to the map widget. Leaflet has different layer options, including addTiles
and addMarkers
which do different things. You can read about them on the leaflet website. Since we’re using a previously created shapefile, we’ll add the shapefile to the map using addPolygons()
.
The first argument you need to specify after calling addPolygons is data = [data-source]
. [data-source]
is whatever variable your data is stored in. For me, it’s called states
. This is either the processed data from part I of this series or the saved shapefile loaded above under the section called load data.
When you run only the first two lines, Leaflet will use its default styling. The base color will be a light blue and the outlines of the states will be dark blue and fairly thick.
- - - -You can leave the base map like this if you want, but all additional data will be added as a layer on top of this map which can become distracting very quickly. I prefer to make my base maps as basic and unobtrusive as possible so the data I add on top of the base map is more prominent.
- -smoothFactor
controls how much the polygon shape should be smoothed at each zoom level. The lower the number the more accurate your shapes will be. A larger number, on the other hand, will lead to better performance, but can distort the shapes of known areas.
I keep the smoothFactor
low because I want the United States to appear as a coherent land mass. The image below shows three different maps, each with a different smoothFactor to illustrate what this argument does. On the left, the map’s smoothFactor=0.2
, the center map’s smoothFactor=10
, and the right’s smoothFactor=100
.
As you can see, the higher the smoothFactor
the less coherent the United States becomes.
addPolygons()
.
-fillColor
refers to what color is on the inside of the polygons. Since I want a minimal base map, I usually set this value to be some shade of grey. If you want a different color, you only need to replace #808080
with the corresponding hex code for the color you want. Here is a useful hex color picker. If you have a hex value and you want the same color in a different shade, this is a useful site.
fillOpacity
determines how transparent the color inside the shape should be. I set mine to be 0.5
because I like the way it looks. The number can be between 0 and 1 with 1 being fully opaque and 0 being fully transparent.
The next four lines define the appearance of the shapes’ outline.
- -The stroke
property can be set to either TRUE
or FALSE
. When true, Leaflet adds an outline around each polygon. When false, the polygons have no outline. In the image below, the map on the left has the default outlines and on the right stroke = FALSE
.
weight = 0.5
sets the thickness of the outlines to be 0.5 pixels. This can be any value you want with higher numbers corresponding to thicker lines. Lower numbers correspond to thinner lines.
The opacity
property operates in the same way as fill opacity above, but on the outlines. The number can be between 0 and 1. Lower numbers correspond to the lines being more transparent and 1 means fully opaque.
color = "#808080"
sets the color of the outline. I typically set it to be the same color as the fill color.
If you want a static base map then lines 2-10 are all you need, as shown in the image below. I like to add some functionality to my base map so that the individual states become darker when they’re hovered over.
- - - -Lines 11-15 define the map’s behavior when the mouse hovers over the shape. Most of the options are the same as the ones used on the base polygon shapes, so I won’t go into them with much detail.
- -highlight = highlightOptions()
contains the mouseover specifications. The word before the equal sign has to be either highlight
or highlightOptions
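. The name appears twice because the first is the name of the addPolygons() argument and the second is the helper function that builds its value; the shorter highlight works because R partially matches argument names.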
highlightOptions()
is the actual function call.
weight
, color
, and fillOpacity
all operate in the same way as before, but whatever values you specify here will only show up when the mouse hovers over.
bringToFront
takes one of two values: TRUE
or FALSE
. It only really matters when you have multiple layers (like we will in later parts of this series). When bringToFront = TRUE
hovering over the state will bring it to the front. When bringToFront = FALSE
it will stay in the back.
Since the base map has only one layer, this property doesn’t affect anything.
- -group = "Base Map")
lets you group multiple layers together. This argument will come in handy as we add more information to the map. The base map is the default layer and is always visible - though, when you use map tiles you can define multiple base layers. All other layers will be on top of the base layer. When using different groups, you can define functionality that allows users to turn off certain layers.
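As a rough sketch of what groups make possible, the example below puts one made-up marker in its own group and adds a layers control so that group can be switched off. The group name "Parks" and the coordinates are assumptions for illustration, and map tiles stand in for the shapefile base map.

```r
library(leaflet)

map_demo <- leaflet() %>%
  # tiles stand in for the base polygons here
  addTiles(group = "Base Map") %>%
  # one invented marker, assigned to its own group
  addCircleMarkers(lng = -114.5, lat = 36.4, group = "Parks") %>%
  # checkbox control that toggles the "Parks" group on and off
  addLayersControl(
    overlayGroups = c("Parks"),
    options = layersControlOptions(collapsed = FALSE)
  )

map_demo
```

Any layer added with the same group value is toggled together, which is why it helps to name the groups consistently from the start.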
You’ve created your first base map! It’s a boring flat, grey map, but it’s the base we’ll use when adding in the national and state park data. In part III of this series we’ll process and add in the National Parks.
- - -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/32/64cce75ff13780ece87dede51e072bf71b04cc300fb1718d8271c2df5ac124 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/32/64cce75ff13780ece87dede51e072bf71b04cc300fb1718d8271c2df5ac124 deleted file mode 100644 index 7c5a78c..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/32/64cce75ff13780ece87dede51e072bf71b04cc300fb1718d8271c2df5ac124 +++ /dev/null @@ -1,621 +0,0 @@ -I"ŁçWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There’s a couple of ways to view the data that does not rely on R. First, is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer and filter for the 50 states and for state-owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) I load the data using st_read()
instead of read_sf(), although read_sf() is a thin wrapper around st_read() and would also work.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
dplyr’s filter()
function (part of the Tidyverse) lets you combine conditions within one call by using or (|) and and (&).
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
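One likely reason a combined call misbehaves is operator precedence: in R, & is evaluated before |, so a condition written as A & B | C is read as (A & B) | C. Below is a minimal sketch on an invented data frame (only the column names come from PAD-US; the rows and the keep_types vector are made up for illustration). It uses base R's %in% in place of the chain of | comparisons, and !(x %in% y) as a stand-in for %!in% so the example does not depend on operator.tools.

```r
library(dplyr)

# Invented rows for illustration; only the column names come from PAD-US.
toy <- tibble(
  State_Nm = c("NV", "PR", "CA", "OR"),
  Own_Type = c("STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SP", "SP", "MIL")
)

territories <- c("AS", "GU", "MP", "PR", "VI")
keep_types  <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Without parentheses, & is evaluated before |, so the territory row and the
# federal row both slip through on the strength of Des_Tp == "SP" alone.
wrong <- toy %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT" & Des_Tp == "ACC" | Des_Tp == "SP")

# %in% replaces the chain of | comparisons, so all three conditions can sit
# in a single filter() call (comma-separated conditions are combined with and).
right <- toy %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% keep_types)
```

Wrapping the whole | chain in parentheses would also work; the %in% version is simply shorter to read and to extend.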
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of dplyr, one of the tidyverse packages, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"; for the rest, I just populated the new column with the old values.
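As a small illustration of that general form, here is a sketch on an invented three-row tibble (the values are made up; d_Des_Tp is the PAD-US column name). case_when() checks the conditions from top to bottom and uses the first one that matches. Rows that match nothing become NA unless a fallback such as TRUE ~ ... is supplied, which is the assumption this sketch adds.

```r
library(dplyr)

# Invented designation values for illustration.
toy <- tibble(d_Des_Tp = c("State Park", "Access Area", "Marine Protected Area"))

toy <- toy %>%
  mutate(type = case_when(
    d_Des_Tp == "Access Area" ~ "State Trail",
    d_Des_Tp == "State Park"  ~ "State Park or Parkway",
    TRUE                      ~ d_Des_Tp   # fallback: keep the original value
  ))
```

Without the final TRUE line, the third row would get NA in the new column instead of its original designation.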
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
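One possible way to keep the list manageable, offered only as a sketch and not as what the series does, is to store the visited park names in a single vector and test membership with %in%. The visited_parks vector and the toy rows are hypothetical; Unit_Nm is the PAD-US unit name column.

```r
library(dplyr)

# Hypothetical helper vector: one place to list the parks visited so far.
visited_parks <- c("Valley of Fire State Park", "Salton Sea")

# Invented rows for illustration.
toy <- tibble(Unit_Nm = c("Valley of Fire State Park", "Some Other State Park", "Salton Sea"))

# One membership test replaces one case_when() line per park.
toy <- toy %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))
```

Adding a park then means appending one string to visited_parks rather than editing the pipeline. The same vector could be read from a small text or CSV file if the list grows.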
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears below the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
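To see what that reprojection does in isolation, here is a minimal sketch with one invented point. It assumes EPSG:5070 as a stand-in Albers definition (the shifted data may use a different Albers variant) and EPSG:4326 for WGS84 longitude/latitude, which is generally treated as equivalent to the proj string used in the pipeline.

```r
library(sf)

# One invented point in an Albers Equal Area projection (EPSG:5070, metres).
pt_albers <- st_sfc(st_point(c(-1500000, 1800000)), crs = 5070)

# EPSG:4326 is unprojected WGS84 longitude/latitude, which Leaflet expects.
pt_wgs84 <- st_transform(pt_albers, 4326)

st_is_longlat(pt_albers)  # FALSE
st_is_longlat(pt_wgs84)   # TRUE
```

Checking st_is_longlat() before handing data to Leaflet is a quick way to confirm the transform was applied.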
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -Here you’ll find several non-partisan ballot guides. I try to include as much information as possible without directly recreating the official voter guide. The information is sourced from the ballot guide, calmatters, ballotpedia, LA Times, voter’s edge, and Mercury News. I have included links to the campaign website or wherever most of the information came from.
+ +I have done my best to keep my views out of it.
+ +I started this to help my two aunts because they would ask me to simplify their ballots for them. Democracy relies on an informed and participatory citizenry, but it’s not always easy. This is meant to alleviate some of the burden.
+ +If you notice any errors, you feel like I’ve missed something, or you found this guide helpful feel free to send me an email [click the envelope at the bottom of the page]
+ +It made a difference to that one. + ++ +
I have a simple motto in life: Do what you can, where you are, with what you have. As a result, I believe strongly in doing whatever is in my means to make the world a better place. Below are some groups that I have either founded or joined in order to help those around me.</p>
+ +UCI Humanities: Writing Workshop | Member +University of California, Irvine
+ +<h1>Previous Groups</h1>
+<hr class = "h-line">
+<ul>
+ <li><i>Friends of the San Dimas Dog Park</i> | Ambassador <br/>
+ San Dimas, California </li><br/>
+ <li><i>Prisoner Education Project</i> | Volunteer <br/>
+ Pomona, California</li><br/>
+ <li><i>Tails of the City</i> | Volunteer Photographer <br/>
+ Los Angeles, California</li><br/>
+ <li><i>Philosophy Club</i> | President, Graphic Designer, and Banquet Chair <br/>
+ California State Polytechnic University, Pomona</li><br/>
+ <li><i><a href = "https://www.voteamerica.com/">Long Distance Voter</a></i> | Intern <br/>
+ Social Media Content Creator</li><br/>
+ <li><i><a href = "https://www.freepress.net/">Free Press</a></i> | Intern <br/>
+ Social Media Content Creator</li>
+</ul> <!-- </div>
+
</div> + –>
+:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/33/75b1c677d7b1e5cedf1c6642297bc124aadd75275e2902761f1726023d2982 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/33/75b1c677d7b1e5cedf1c6642297bc124aadd75275e2902761f1726023d2982 deleted file mode 100644 index 1427d40..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/33/75b1c677d7b1e5cedf1c6642297bc124aadd75275e2902761f1726023d2982 +++ /dev/null @@ -1,516 +0,0 @@ -I"˝µWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to load only the geopackage’s layer names. This doesn’t load the complete data set; it just lists the layers so you can use them alongside the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
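If you do want to peek at a layer from R, one option is to let the reader filter rows while loading instead of pulling in the whole layer. The sketch below is an illustration under assumptions: it builds a tiny invented geopackage in a temporary file (the layer name toy_fee and the two rows are made up) and relies on the query argument of st_read(), which passes an SQL statement to the underlying driver. I have not tried this on the full PAD-US file.

```r
library(sf)

# Build a tiny stand-in geopackage in a temp file (invented data, not PAD-US).
toy <- st_sf(
  Unit_Nm  = c("Park A", "Forest B"),
  Own_Type = c("STAT", "FED"),
  geometry = st_sfc(st_point(c(-114.5, 36.4)), st_point(c(-120.0, 40.0)), crs = 4326)
)
gpkg <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg, layer = "toy_fee", quiet = TRUE)

# List the layer names without reading any features.
st_layers(gpkg)$name

# Read only the rows that match the WHERE clause; the rest never reach R.
state_only <- st_read(gpkg,
                      query = "SELECT * FROM toy_fee WHERE Own_Type = 'STAT'",
                      quiet = TRUE)
```

With the real file, the layer name in the SQL statement would be PADUS3_0Fee and the path would point at the downloaded .gpkg.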
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer and filter for the 50 states and for state-owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway"))
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) I load the data using st_read()
instead of read_sf(), although read_sf() is a thin wrapper around st_read() and would also work.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
dplyr’s filter()
function (part of the Tidyverse) lets you combine conditions within one call by using or (|) and and (&).
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to load only the geopackage’s layer names. This doesn’t load the complete data set; it just lists the layers so you can use them alongside the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read() instead of read_sf(). Either function can read a geopackage; read_sf() is a quieter wrapper around st_read() that returns a tibble.
Here st_read() takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35, make sure you choose the one with the .gpkg extension.
The second argument, layer =, specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
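Because loading a full layer can overwhelm R, it may be worth filtering while reading. The following is a minimal sketch, assuming an sf installation whose GDAL driver accepts SQL for geopackages: st_read() has a query argument, so only the matching rows are loaded. The layer name toy_fee and its two rows are made up for illustration; with the real data the table name in the SQL would be the layer name.

```r
library(sf)

# Toy stand-in for the PAD-US Fee layer (hypothetical rows, not real data)
toy <- st_sf(
  Unit_Nm  = c("Example State Park", "Example National Forest"),
  Own_Type = c("STAT", "FED"),
  geometry = st_sfc(st_point(c(-114.5, 36.4)),
                    st_point(c(-120.0, 40.0)),
                    crs = 4326)
)

# Write it to a temporary geopackage so the example is self-contained
gpkg_path <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg_path, layer = "toy_fee", quiet = TRUE)

# Read back only the state-owned rows; the SQL is applied while reading
state_only <- st_read(
  gpkg_path,
  query = "SELECT * FROM toy_fee WHERE Own_Type = 'STAT'",
  quiet = TRUE
)

nrow(state_only)
```

Whether this helps with the full PAD-US file depends on the machine and the query, so treat it as something to try rather than a guaranteed fix.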
R and the Tidyverse’s filter() function allow you to combine conditions within one filter call by using or (|) or and (&).
The logic in this line says: filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
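One common reason a combined filter() call misbehaves is operator precedence: in R, & binds more tightly than |, so a chain of | conditions needs its own parentheses when it sits next to & conditions. The sketch below uses a made-up data frame (the rows are hypothetical) and base R's %in% in place of the long chain of == tests; it is one possible way to merge the filters, not necessarily what went wrong in the original attempt.

```r
library(dplyr)

# Hypothetical rows that mimic the PAD-US columns used above
toy <- tibble(
  Unit_Nm  = c("A", "B", "C", "D"),
  State_Nm = c("NV", "PR", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SP", "SP", "MIL")
)

territories  <- c("AS", "GU", "MP", "PR", "VI")
designations <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Option 1: keep the == chain, but wrap the "or" group in parentheses
with_parens <- toy %>%
  filter(!(State_Nm %in% territories) &
           Own_Type == "STAT" &
           (Des_Tp == "SP" | Des_Tp == "SREC"))

# Option 2: replace the whole chain with a single %in% test
with_in <- toy %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% designations)

with_in$Unit_Nm
```

Only row A survives both versions: B is in a territory, C is not state owned, and D has a designation outside the list.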
-Yet another filter() call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter() call, I chose to use != solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself, and if I saw d_Pub_Acce == "Open Access" my first thought would be: “What are the other types?” and then I’d try to find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -Lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select() lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate() is part of dplyr, one of the tidyverse packages, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate() and case_when() in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value"). I only needed to change the values for the "Recreation Management Area"s; for the rest, I just populated the new column with the old values.
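As a small illustration of that general form, here is a sketch on a made-up column (the values and the recreation_like vector are hypothetical). It groups several designations with %in% and uses a TRUE fallback so that unmatched rows keep their original value; without a fallback, case_when() returns NA for any row that matches no condition.

```r
library(dplyr)

# Hypothetical designation values, including one that no rule covers
toy <- tibble(
  d_Des_Tp = c("State Park", "Recreation Management Area",
               "State Wilderness", "Something Else")
)

# Designations that should collapse into one map category
recreation_like <- c("Recreation Management Area", "State Wilderness")

toy <- toy %>%
  mutate(type = case_when(
    d_Des_Tp == "State Park"      ~ "State Park or Parkway",
    d_Des_Tp %in% recreation_like ~ "State Preserve, Reserve, or Recreation Area",
    TRUE                          ~ d_Des_Tp  # fallback: keep the original value
  ))

toy$type
```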
Here is where I ran into some issues. In part III of the series, when I processed the National Park data, I included a mutate() and case_when() call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate() created a new column, visited, and populated it by using case_when().
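One possible way to make the visited column easier to maintain is to keep the park names in a single vector and test membership with %in%, so adding a park means adding one string rather than one case_when() line. This is only a sketch on made-up rows; visited_parks is a hypothetical name, and if_else() comes from dplyr.

```r
library(dplyr)

# Hypothetical list of visited parks; extend it one name at a time
visited_parks <- c("Valley of Fire State Park", "Salton Sea")

# Made-up rows standing in for the state park data
toy <- tibble(
  Unit_Nm = c("Valley of Fire State Park", "Some Other Park", "Salton Sea")
)

toy <- toy %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

toy$visited
```

The match is exact, so a name in the vector has to be spelled the same way as the Unit_Nm value in the data.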
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -The split() function is part of base R. It takes a few arguments, most of which are optional.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks data into its corresponding states, so it is listed first.
The second argument, f =, is how you want the data split. f in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm)) in the console, it will return a list of the 50 state abbreviations. Those are the groups we’re telling R to split the data into.
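To see what split() and the loop do without touching the real data, here is a sketch on a made-up data frame. It writes CSV files to a temporary folder with base R's write.csv() instead of shapefiles with st_write(), purely so the example runs without sf; the rows and the out_dir name are hypothetical.

```r
# Made-up rows standing in for the state park data
toy <- data.frame(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  State_Nm = c("NV", "CA", "NV"),
  stringsAsFactors = FALSE
)

# One list element per state, named after the state abbreviation
split_toy <- split(toy, f = toy$State_Nm)
all_names <- names(split_toy)

# Write each state's rows to its own file in a temporary folder
out_dir <- tempdir()
for (name in all_names) {
  write.csv(split_toy[[name]],
            file.path(out_dir, paste0(name, ".csv")),
            row.names = FALSE)
}

all_names
```

Each pass through the loop pulls one state's rows out of the list by name, which is the same pattern the st_write() loop above relies on.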
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me a line if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There’s a couple of ways to view the data that does not rely on R. First, is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Fields which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited") %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the layer PADUS3_0Fee layer because I know it contains the shape data from the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After these the two conditions in this line the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is not Closed or Unknown
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want discard any column don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area
s, the rest I just populated the new column with the old values.
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had original wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the mean time I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There’s a couple of ways to view the data that does not rely on R. First, is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE, # shift_geometry() comes from the tigris package
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although either function can read a geopackage.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
The Tidyverse’s filter()
function lets you combine conditions within a single call by using the or operator |
or the and operator &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
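To make the "both conditions must be true" logic concrete, here is a minimal sketch on a made-up data frame. The rows are invented for illustration; only the column names State_Nm and Own_Type and the code "STAT" come from the PAD-US data. I use base R's `!(x %in% y)` here, which I'm assuming behaves the same as `%!in%` from operator.tools, so the sketch runs without that package.

```r
library(dplyr)

territories <- c("AS", "GU", "MP", "PR", "VI")

# Invented rows, not real PAD-US records
toy <- tibble(
  Unit_Nm  = c("Park A", "Park B", "Park C", "Park D"),
  State_Nm = c("NV", "PR", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "FED", "STAT")
)

# A row is kept only when BOTH conditions are TRUE:
# it is not in a territory AND it is state owned
kept <- toy %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT")

kept$Unit_Nm
```

Park B drops out because it is in a territory, and Park C drops out because it is not state owned.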
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
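If the long chain of `Des_Tp == ... |` comparisons feels unwieldy, one possible alternative is to put the codes in a vector and test membership with `%in%`. This is only a sketch on invented rows (the code "XX" is a placeholder I made up, not a PAD-US designation), and `keep_designations` is a name I chose for illustration.

```r
library(dplyr)

# The nine designation codes kept in the main code block
keep_designations <- c("ACC", "HCA", "REC", "SCA", "SHCA",
                       "SP", "SREC", "SRMA", "SW")

# Invented rows; "XX" is a placeholder code that should be dropped
toy <- tibble(
  Unit_Nm = c("Park A", "Area B", "Park C"),
  Des_Tp  = c("SP", "XX", "SREC")
)

# The chained form used in the post
chained <- toy %>%
  filter(Des_Tp == "ACC" | Des_Tp == "HCA" | Des_Tp == "REC" |
         Des_Tp == "SCA" | Des_Tp == "SHCA" | Des_Tp == "SP" |
         Des_Tp == "SREC" | Des_Tp == "SRMA" | Des_Tp == "SW")

# The compact form: keep rows whose code appears in the vector
compact <- toy %>%
  filter(Des_Tp %in% keep_designations)

identical(chained, compact)
```

On this toy data both forms keep the same rows, so the choice between them comes down to readability.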
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
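For anyone weighing `!=` against `==` here, this small sketch compares excluding the two unwanted access types with explicitly keeping the two wanted ones. The rows are invented, and I added a missing value on purpose to show how each form treats it. I have not checked whether the real d_Pub_Acce column contains any missing values.

```r
library(dplyr)

# Invented rows covering the four access types plus a missing value
toy <- tibble(
  Unit_Nm    = c("A", "B", "C", "D", "E"),
  d_Pub_Acce = c("Open Access", "Closed", "Restricted Access", "Unknown", NA)
)

# Exclusion form, as in the post: drop Closed and Unknown
by_exclusion <- toy %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")

# Inclusion form: keep only the two types I can visit
by_inclusion <- toy %>%
  filter(d_Pub_Acce %in% c("Open Access", "Restricted Access"))

identical(by_exclusion, by_inclusion)
```

Both forms also drop the row with the missing value, because `filter()` only keeps rows where the condition is TRUE.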
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"
entries; for the rest, I just populated the new column with the old values.
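As a quick refresher on the general form, here is a minimal sketch of `case_when()` inside `mutate()` on invented rows. The value "Some Other Type" is made up to show the fallback branch. Without a fallback, any row that matches none of the conditions ends up as NA in the new column.

```r
library(dplyr)

# Invented rows; "Some Other Type" is not a real PAD-US designation
toy <- tibble(d_Des_Tp = c("State Park", "Access Area", "Some Other Type"))

toy <- toy %>%
  mutate(type = case_when(
    d_Des_Tp == "Access Area" ~ "State Trail",
    d_Des_Tp == "State Park"  ~ "State Park or Parkway",
    TRUE                      ~ d_Des_Tp  # fallback: keep the original value
  ))

toy$type
```

The conditions are checked in order, so the `TRUE` line only applies to rows that none of the earlier lines matched.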
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data for Alaska and Hawaii so it appears below the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
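The coordinate change can be tried on a single point before running it on the full data. This is only a sketch: the point is invented, and I'm assuming EPSG:5070 as a stand-in for the Albers projection and EPSG:4326 as the shorthand for the WGS84 longitude/latitude system that the longer projection string in the code describes.

```r
library(sf)

# One invented point in an Albers equal-area projection (EPSG:5070, my assumption)
pt_albers <- st_sfc(st_point(c(-1500000, 1800000)), crs = 5070)

# Transform to WGS84 longitude/latitude, the system Leaflet expects
pt_wgs84 <- st_transform(pt_albers, 4326)

# TRUE when the coordinates are longitude/latitude degrees
st_is_longlat(pt_wgs84)
```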
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data that do not rely on loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
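If you want to try st_layers() without the full PAD-US download, this sketch writes a tiny geopackage to a temporary file and lists its layers. The two points and the layer name "toy_parks" are invented for illustration.

```r
library(sf)

# Two invented points standing in for real park data
pts <- st_sf(
  name = c("a", "b"),
  geometry = st_sfc(st_point(c(-114.5, 36.4)),
                    st_point(c(-124.2, 42.0)),
                    crs = 4326)
)

# Write them to a temporary geopackage under a made-up layer name
gpkg <- tempfile(fileext = ".gpkg")
st_write(pts, gpkg, layer = "toy_parks", quiet = TRUE)

# List the layers without reading the features themselves
layers <- st_layers(gpkg)
layers$name
```

The same `layers$name` lookup gives you the exact layer names to pass to the `layer =` argument of st_read().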
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-
state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
- filter(Own_Type == "STAT") # Own_Type holds the code "STAT"
-
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data that do not rely on loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE, # shift_geometry() comes from the tigris package
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although either function can read a geopackage.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
The Tidyverse’s filter()
function lets you combine conditions within a single call by using the or operator |
or the and operator &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any column I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area
s; for the rest, I just populated the new column with the old values.
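As a minimal sketch of that pattern, the example below recodes one designation and copies every other value across unchanged. The `parks` tibble and the replacement label "State Recreation Area" are assumptions made up for illustration, not the values used in the original code.

```r
library(dplyr)

# Toy data; the designation values are illustrative only
parks <- tibble(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  d_Des_Tp = c("State Park", "Recreation Management Area", "State Wilderness")
)

parks <- parks %>%
  mutate(type = case_when(
    # change only the value that needs a new label (hypothetical new label)
    d_Des_Tp == "Recreation Management Area" ~ "State Recreation Area",
    # every other row falls through and keeps its old value
    TRUE ~ d_Des_Tp
  ))

parks$type
```

The `TRUE ~ d_Des_Tp` line is the catch-all: without it, rows that match no condition would get `NA` in the new column.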
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the mean time I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
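The reprojection step described for line 40 can be sketched on a single point. This is an illustration under assumptions: EPSG:5070 is used here as a stand-in Albers projection (the series may use a different Albers definition), and the coordinates are arbitrary.

```r
library(sf)

# One arbitrary point in WGS84 (longitude, latitude)
pt_wgs84 <- st_sfc(st_point(c(-115.5, 36.4)), crs = 4326)

# Project it to an Albers equal-area CRS (EPSG:5070 is an assumption)
pt_albers <- st_transform(pt_wgs84, 5070)

# Transform back to WGS84, the coordinate system Leaflet expects
pt_back <- st_transform(pt_albers, 4326)

st_crs(pt_back)
st_coordinates(pt_back)
```

After the round trip the point is back in longitude and latitude, which is what Leaflet needs to place it on the map.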
Here you’ll find several non-partisan ballot guides. I try to include as much information as possible without directly recreating the official voter guide. The information is sourced from the ballot guide, calmatters, ballotpedia, LA Times, voter’s edge, and Mercury News. I have included links to the campaign website or wherever most of the information came from.
+ +I have done my best to keep my views out of it.
+ +I started this to help my two aunts because they would ask me to simplify their ballots for them. Democracy relies on an informed and participatory citizenry, but it’s not always easy. This is meant to alleviate some of the burden.
+ +If you notice any errors, you feel like I’ve missed something, or you found this guide helpful feel free to send me an email [click the envelope at the bottom of the page]
+ +Click on a link to download the PDF.
+Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that don’t require R to load the whole thing. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
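To see what `st_layers()` returns without downloading PAD-US, here is a small sketch that writes a throwaway geopackage to a temporary file and lists its layers. The layer name `demo_points` and the two points are made up for illustration.

```r
library(sf)

# Two arbitrary points stored as a tiny sf object
pts <- st_sf(
  name = c("a", "b"),
  geometry = st_sfc(st_point(c(-115.5, 36.4)),
                    st_point(c(-124.2, 42.0)),
                    crs = 4326)
)

# Write them to a temporary geopackage as a layer called "demo_points"
gpkg <- tempfile(fileext = ".gpkg")
st_write(pts, gpkg, layer = "demo_points", quiet = TRUE)

# List the layers without reading the features themselves
layers <- st_layers(gpkg)
layers$name      # layer names
layers$features  # number of features in each layer
```

The same `$name` and `$features` fields are available on the result for the PAD-US geopackage, which makes it easy to look up a layer name before calling `st_read()`.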
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
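A small sketch of reading one named layer from a multi-layer geopackage follows. The file is a temporary one created on the spot, and the layer names `points_a` and `points_b` are hypothetical stand-ins for the PAD-US layers.

```r
library(sf)

# Two tiny sf objects that will become two layers in one geopackage
layer_a <- st_sf(id = 1:2,
                 geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326))
layer_b <- st_sf(id = 3L,
                 geometry = st_sfc(st_point(c(2, 2)), crs = 4326))

gpkg <- tempfile(fileext = ".gpkg")
st_write(layer_a, gpkg, layer = "points_a", quiet = TRUE)
st_write(layer_b, gpkg, layer = "points_b", quiet = TRUE)

# First argument: path to the .gpkg file; layer = picks which layer to load
only_b <- st_read(gpkg, layer = "points_b", quiet = TRUE)
nrow(only_b)
```

Only the requested layer is read, so the other layer in the file never has to be loaded into memory.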
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
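For what it’s worth, here is a hedged sketch of how the designation test could sit in the same `filter()` call as the ownership test. It uses `%in%` in place of the chain of `|` comparisons; the toy `parks` tibble is invented for illustration. One possible reason a combined call misbehaves is that `&` binds more tightly than `|` in R, so an unparenthesized mix of the two groups differently than intended.

```r
library(dplyr)

# Toy data; values are illustrative only
parks <- tibble(
  Unit_Nm  = c("Park A", "Park B", "Park C", "Park D"),
  Own_Type = c("STAT", "STAT", "STAT", "FED"),
  Des_Tp   = c("SP", "SREC", "MPA", "SP")
)

keep_designations <- c("ACC", "HCA", "REC", "SCA", "SHCA",
                       "SP", "SREC", "SRMA", "SW")

# One call: comma-separated conditions are combined with AND
one_call <- parks %>%
  filter(Own_Type == "STAT", Des_Tp %in% keep_designations)

# Two chained calls, for comparison
chained <- parks %>%
  filter(Own_Type == "STAT") %>%
  filter(Des_Tp %in% keep_designations)

identical(one_call, chained)
```

Both versions keep the same rows, so the choice between them is mostly about readability.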
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any column I know I won’t use in my map.
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/49/dbea98cb249de42b1c8ae86a404143ff0e816cf4c142a746a5675e5e493221 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/49/dbea98cb249de42b1c8ae86a404143ff0e816cf4c142a746a5675e5e493221 new file mode 100644 index 0000000..25b8a85 --- /dev/null +++ b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/49/dbea98cb249de42b1c8ae86a404143ff0e816cf4c142a746a5675e5e493221 @@ -0,0 +1,55 @@ +I" + +It made a difference to that one. + ++ +
I have a simple motto in life: Do what you can, where you are, with what you have. As a result, I believe strongly in doing whatever is in my means to make the world a better place. Below are some groups that I have either founded or joined in order to help those around me.</p>
+ +UCI Humanities: Writing Workshop | Member
+University of California, Irvine
<h1>Previous Groups</h1>
+<hr class = "h-line">
+<ul>
+ <li><i>Friends of the San Dimas Dog Park</i> | Ambassador <br/>
+ San Dimas, California </li><br/>
+ <li><i>Prisoner Education Project</i> | Volunteer <br/>
+ Pomona, California</li><br/>
+ <li><i>Tails of the City</i> | Volunteer Photographer <br/>
+ Los Angeles, California</li><br/>
+ <li><i>Philosophy Club</i> | President, Graphic Designer, and Banquet Chair <br/>
+ California State Polytechnic University, Pomona</li><br/>
+ <li><i><a href = "https://www.voteamerica.com/">Long Distance Voter</a></i> | Intern <br/>
+ Social Media Content Creator</li><br/>
+ <li><i><a href = "https://www.freepress.net/">Free Press</a></i> | Intern <br/>
+ Social Media Content Creator</li>
+</ul> <!-- </div>
+
</div> + -->
+:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/4a/98107153859527c24a981ee82410cc3d75b010352704f29da87c5c82d76850 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/4a/98107153859527c24a981ee82410cc3d75b010352704f29da87c5c82d76850 deleted file mode 100644 index 85c4818..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/4a/98107153859527c24a981ee82410cc3d75b010352704f29da87c5c82d76850 +++ /dev/null @@ -1,298 +0,0 @@ -I"g[This is a continuation of my previous post where I walked through how to download and modify shape data. I also showed how to shift Alaska and Hawaii so they are closer to the continental usa. -
- -In this post, I’ll go over how to use Leaflet to map the shapefile we made in the previous post. If you’ve come here from part one of the series, you probably have the libraries and data loaded already. However, if you don’t, be sure to load the libraries and shapefiles before moving to number two.
- -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
- -1
-2
-
## load data
- states <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
to reflect wherever you saved the shifted shapefile.
If your data processing and base map creation are in the same file, you can skip this line, and when you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -At its most basic, all Leaflet needs to create a map is a base map and data layers. The code below may look intimidating, but it’s mostly style options.
- -This is the map we’re going to create. It’s a simple grey map and each state darkens in color as you hover over it. I’ll show the same map after each style option is added so you can see what effect it has.
- - - -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-
## create usa base map using leaflet()
- map <- leaflet() %>%
- addPolygons(data = states,
- smoothFactor = 0.2,
- fillColor = "#808080",
- fillOpacity = 0.5,
- stroke = TRUE,
- weight = 0.5,
- opacity = 0.5,
- color = "#808080",
- highlight = highlightOptions(
- weight = 0.5,
- color = "#000000",
- fillOpacity = 0.7,
- bringToFront = FALSE),
- group = "Base Map")
-
leaflet()
initializes the map widget. I save it to a variable called map (map <-
) so I can run other code in the file without recreating the map each time. When you want to see the map, you can type map
(or whatever you want to name your map) in the terminal and hit enter. R will display the map in the viewer.
addPolygons()
adds a layer to the map widget. Leaflet has different layer options, including addTiles
and addMarkers
which do different things. You can read about them on the leaflet website. Since we’re using a previously created shapefile, we’ll add the shapefile to the map using addPolygons()
.
The first argument you need to specify after calling addPolygons is data = [data-source]
. [data-source]
is whatever variable your data is stored in. For me, it’s called states
. This is either the processed data from part I of this series or the saved shapefile loaded above under the section called load data.
When you run only the first two lines, Leaflet will use its default styling. The base color will be a light blue and the outlines of the states will be dark blue and fairly thick.
- - - -You can leave the base map like this if you want, but all additional data will be added as a layer on top of this map, which can become distracting very quickly. I prefer to make my base maps as basic and unobtrusive as possible so the data I add on top of the base map is more prominent.
- -smoothFactor
controls how much the polygon shape should be smoothed at each zoom level. The lower the number the more accurate your shapes will be. A larger number, on the other hand, will lead to better performance, but can distort the shapes of known areas.
I keep the smoothFactor
low because I want the United States to appear as a coherent land mass. The image below shows three different maps, each with a different smoothFactor to illustrate what this argument does. On the left, the map’s smoothFactor=0.2
, the center map’s smoothFactor=10
, and the right’s smoothFactor=100
.
As you can see, the higher the smoothFactor
the less coherent the United States becomes.
addPolygons()
.
-fillColor
refers to what color is on the inside of the polygons. Since I want a minimal base map, I usually set this value to be some shade of grey. If you want a different color, you only need to replace #808080
with the corresponding hex code for the color you want. Here is a useful hex color picker. If you have a hex value and you want the same color in a different shade, this is a useful site.
fillOpacity
determines how transparent the color inside the shape should be. I set mine to be 0.5
because I like the way it looks. The number can be between 0 and 1 with 1 being fully opaque and 0 being fully transparent.
The next four lines define the appearance of the shapes’ outline.
- -The stroke
property can be set to either TRUE
or FALSE
. When true, Leaflet adds an outline around each polygon. When false, the polygons have no outline. In the image below, the map on the left has the default outlines and on the right stroke = FALSE
.
weight = 0.5
sets the thickness of the outlines to be 0.5 pixels. This can be any value you want with higher numbers corresponding to thicker lines. Lower numbers correspond to thinner lines.
The opacity
property operates in the same way as fill opacity above, but on the outlines. The number can be between 0 and 1. Lower numbers correspond to the lines being more transparent and 1 means fully opaque.
color = "#808080"
sets the color of the outline. I typically set it to be the same color as the fill color.
If you want a static base map then lines 2-10 are all you need, as shown in the image below. I like to add some functionality to my base map so that the individual states become darker when they’re hovered over.
- - - -Lines 11-15 define the map’s behavior when the mouse hovers over the shape. Most of the options are the same as the ones used on the base polygon shapes, so I won’t go into them with much detail.
- -highlight = highlightOptions()
contains the mouseover specifications. The word before the equal sign has to be either highlight
or highlightOptions
. It looks like highlight is declared twice, but the first is the name of the addPolygons() argument and the second is the function that builds its value.
highlightOptions()
is the actual function call.
weight
, color
, and fillOpacity
all operate in the same way as before, but whatever values you specify here will only show up when the mouse hovers over.
bringToFront
takes one of two values: TRUE
or FALSE
. It only really matters when you have multiple layers (like we will in later parts of this series). When bringToFront = TRUE
hovering over the state will bring it to the front. When bringToFront = FALSE
it will stay in the back.
Since the base map has only one layer, this property doesn’t affect anything.
- -group = "Base Map")
lets you group multiple layers together. This argument will come in handy as we add more information to the map. The base map is the default layer and is always visible - though, when you use map tiles you can define multiple base layers. All other layers will be on top of the base layer. When using different groups, you can define functionality that allows users to turn off certain layers.
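As a rough sketch of what groups make possible, the example below puts a couple of markers in a group and adds a control that lets the user toggle it. The `parks` data frame, its coordinates, and the group name "State Parks" are made up for illustration; `addLayersControl()` and `layersControlOptions()` are part of the leaflet package.

```r
library(leaflet)

# Made-up marker data for illustration
parks <- data.frame(
  name = c("Park A", "Park B"),
  lng  = c(-115.5, -124.2),
  lat  = c(36.4, 42.0)
)

map <- leaflet() %>%
  # every layer added with the same group name is toggled together
  addCircleMarkers(data = parks, lng = ~lng, lat = ~lat,
                   label = ~name, group = "State Parks") %>%
  # adds a checkbox for each overlay group listed here
  addLayersControl(overlayGroups = c("State Parks"),
                   options = layersControlOptions(collapsed = FALSE))

map
```

Unchecking "State Parks" in the control hides the markers while leaving everything else on the map in place.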
You’ve created your first base map! It’s a boring flat, grey map, but it’s the base we’ll use when adding in the national and state park data. In part III of this series we’ll process and add in the National Parks.
- - -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/4d/c72004c5f4ff0cdce4229f29166e31c1cd058c705c1b7ac83f349606ae82c3 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/4d/c72004c5f4ff0cdce4229f29166e31c1cd058c705c1b7ac83f349606ae82c3 new file mode 100644 index 0000000..19d5421 --- /dev/null +++ b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/4d/c72004c5f4ff0cdce4229f29166e31c1cd058c705c1b7ac83f349606ae82c3 @@ -0,0 +1,32 @@ +I"ŕHere you’ll find several non-partisan ballot guides. I try to include as much information as possible without directly recreating the official voter guide. The information is sourced from the ballot guide, calmatters, ballotpedia, LA Times, voter’s edge, and Mercury News. I have included links to the campaign website or wherever most of the information came from.
+ +I have done my best to keep my views out of it.
+ +I started this to help my two aunts because they would ask me to simplify their ballots for them. Democracy relies on an informed and participatory citizenry, but it’s not always easy. This is meant to alleviate some of the burden.
+ +If you notice any errors, you feel like I’ve missed something, or you found this guide helpful feel free to send me an email [click the envelope at the bottom of the page]
+ +Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There’s a couple of ways to view the data that does not rely on R. First, is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer and filter for the 50 states and for state-owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access" &
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read(). (read_sf() is a thin wrapper around st_read(), so it would also work here.) In this call, st_read() takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35, make sure you choose the one with the .gpkg extension.
The second argument, layer =, specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
The Tidyverse’s filter() function lets you combine conditions within one filter call by using | (or) and & (and).
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
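To make the difference between & and | concrete, here is a small sketch on a made-up table (the four rows are hypothetical, not PAD-US records). It uses base R’s !(x %in% y) in place of %!in% so that it runs without operator.tools; the two are meant to be equivalent.

```r
library(dplyr)

# Toy stand-in for the PAD-US attribute table (values are made up)
parks <- tibble(
  Unit_Nm  = c("A", "B", "C", "D"),
  State_Nm = c("NV", "PR", "CA", "OR"),
  Own_Type = c("STAT", "STAT", "FED", "STAT")
)
territories <- c("AS", "GU", "MP", "PR", "VI")

# & : a row is kept only when BOTH conditions are TRUE
both <- parks %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT")

# | : a row is kept when EITHER condition is TRUE
either <- parks %>%
  filter(State_Nm == "CA" | Own_Type == "STAT")

both$Unit_Nm    # "A" "D"
either$Unit_Nm  # "A" "B" "C" "D"
```

Row B is state owned but sits in a territory, so the & version drops it while the | version keeps it.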
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
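I can’t say for certain why the combined version failed, but one common cause is operator precedence: in R, & binds more tightly than |, so a chain of | conditions has to be wrapped in parentheses before it is joined with &. A sketch of the pitfall and of a single-call alternative using %in%, on made-up rows:

```r
library(dplyr)

# Toy rows (hypothetical values)
parks <- tibble(
  Unit_Nm  = c("A", "B", "C", "D"),
  Own_Type = c("STAT", "FED", "STAT", "STAT"),
  Des_Tp   = c("SP", "SP", "SREC", "MIL")
)

# Designation codes taken from the filter() call in the post
keep_des <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# One call: %in% replaces the long chain of | comparisons
one_call <- parks %>%
  filter(Own_Type == "STAT" & Des_Tp %in% keep_des)

# Pitfall: without parentheses this is read as
# (Own_Type == "STAT" & Des_Tp == "SREC") | Des_Tp == "SP"
pitfall <- parks %>%
  filter(Own_Type == "STAT" & Des_Tp == "SREC" | Des_Tp == "SP")

one_call$Unit_Nm  # "A" "C"
pitfall$Unit_Nm   # "A" "B" "C"  (the federal row B slips through)
```

Keeping the filters in separate calls, as the post does, avoids the precedence question entirely; the %in% form is just a more compact way to write the same designation test.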
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter() call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
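One thing worth knowing about the != approach: filter() drops rows where the comparison comes out as NA, so missing values disappear along with the excluded categories. Whether the PAD-US columns contain missing values is an assumption to check on the real data; the sketch below uses made-up rows.

```r
library(dplyr)

# Toy access column with one missing value (hypothetical rows)
access <- tibble(
  Unit_Nm    = c("A", "B", "C", "D", "E"),
  d_Pub_Acce = c("Open Access", "Closed", "Restricted Access", "Unknown", NA)
)

# List the access types that are present, with counts
count(access, d_Pub_Acce)

# Same logic as the post: drop Closed and Unknown
kept <- access %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")

kept$Unit_Nm  # "A" "C"  (row E is dropped too, because NA != "Closed" is NA)
```

If rows with missing values should be kept, adding is.na(d_Pub_Acce) | in front of the comparison (inside parentheses) would retain them.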
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -Lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any column I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate() comes from dplyr, which is loaded with the tidyverse, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value"). I only needed to change the values for Recreation Management Areas; for the rest, I just populated the new column with the old values.
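A minimal sketch of the pattern on made-up rows: the first conditions recode specific designations, and a final TRUE ~ line acts as a fallback that copies the old value. The fallback line is my addition for illustration; without it, any value that matches none of the conditions becomes NA in the new column.

```r
library(dplyr)

# Toy designation column (hypothetical rows)
des <- tibble(
  d_Des_Tp = c("State Park", "Access Area", "State Wilderness",
               "Marine Protected Area")
)

des <- des %>%
  mutate(type = case_when(
    d_Des_Tp == "State Park"       ~ "State Park or Parkway",
    d_Des_Tp == "Access Area"      ~ "State Trail",
    d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
    TRUE                           ~ d_Des_Tp  # fallback: keep the original value
  ))

des$type
```

Conditions are checked in order and the first match wins, so the fallback always goes last.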
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
 - -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
 - -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and is of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
Twitter is a great resource for engaging with the academic community. For example, I saw this Tweet by PhD Genie asking users to name one positive skill learned during their PhD. I love this question for a number of reasons. First, it helps PhDs reframe their experience so it’s applicable outside of academia - which can help when applying to jobs. Second, it’s really cool to see what skills other people have learned during their program.
-
I responded to the tweet because during my PhD I learned how to create maps in R. I started by recreating a map from the University of North Carolina’s Hussman School of Journalism’s News Deserts project (below). Now, I am working on a personal project mapping the U.S. National and State parks.
- - - -There was quite a bit of interest in how to do this, so in this series of posts I will document my process from start to finish.
- -First, I’m not an expert. I wanted to make a map, so I learned how. There may be easier ways and, if I learn how to do them, I’ll write another post.
 - -Second, before starting, I strongly suggest setting up GitHub and DVC. I wrote about how to use GitHub, the GitHub Website, and GitHub Desktop. You can use any of these methods to manage your repositories. I use all three based purely on whatever mood I’m in.
- -If you do use Git or GitHub, then DVC (data version control) is mandatory. GitHub will warn you that your file is too large if it’s over 50MB and reject your pushes if the files are over 100MB. The total repository size can’t exceed 2GB if you’re using the free version (which I am). DVC is useful because cartography files are large. They contain a lot of coordinates which increases with each location you try to map. DVC will store your data outside of GitHub but allows you to track changes with your data. It’s super useful.
- -Third, there are several ways to make a map. R is capable of making interactive maps and static maps. Static maps are less computationally expensive and better for publication. Interactive maps are prettier and better for displaying on the web.
- -I make interactive maps with Leaflet and Shiny because they offer a lot of functionality. The most common way is to use map tiles. Map tiles use data from sources like Open Street Map and Maps to create map squares (tiles) with custom data on top. A list of available map tiles is available on the Open Street Maps website.
- - - -When I make static maps (like the US map pictured above), I use ggplot
- -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-You only need to install the packages once. You can do so by running each line in the terminal. When you rerun the code later, you can skip right to loading the packages using library("package-name")
-
## you only need to install the packages once
-
- install.packages("leaflet") # interactive maps
- install.packages("shiny") # added map functionality
- install.packages("tidyverse") # data manipulation
- install.packages("tigris") # cartographic boundaries
- install.packages("operator.tools") # for the not-in function
- install.packages("sf") # read and write shapefiles
-
leaflet()
)addTiles()
, addPolygons()
, or addMarkers()
.
- The Tidyverse is a collection of packages used for data manipulation and analysis. Its syntax is more intuitive than base R. Furthermore, you can chain (aka pipe) commands together.
- -For cartography, you don’t need the whole Tidyverse. We’ll mainly use dplyr
and ggplot2
. You can install these packages individually instead of installing the whole tidyverse. Though, when we get to the national park database, we’ll also need purrr
and tidyr
.
operator.tools is not required, but it’s recommended.
- -For some unknown reason, base R has a %in%
function but not a not-in
function. Unfortunately, the United States is still an empire with its associated areas, islands, and pseudo-states. I only want to include the 50 states, so I needed a way to easily filter out the non-states. The operator.tools %!in%
function is perfect for that.
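If you’d rather not install operator.tools, a not-in operator can be built in base R. This is a sketch of an alternative, not what this series uses: the `%notin%` name is made up for the example, and the `state_abbr` vector is sample data.

```r
# Base R only: build a not-in operator by negating %in%
# (%notin% is a made-up name for this example)
`%notin%` <- Negate(`%in%`)

state_abbr  <- c("NV", "PR", "OR", "GU", "CA")   # sample data
territories <- c("AS", "GU", "MP", "PR", "VI")

# Both lines keep only the abbreviations that are not territories
state_abbr[state_abbr %notin% territories]
state_abbr[!(state_abbr %in% territories)]
```

Wrapping `%in%` in `!( )` works everywhere, but a named operator reads more naturally inside a long `filter()` call.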
To start, create and save a new file called usa.r
. In it, we’re going to download and modify the United States shape data that we’ll use to create the base map in part two of this series.
At the beginning of each file, you have to load the necessary packages. In this file, the only packages we need to load are tidyverse, sf, and tigris. I also load leaflet to make sure the map renders correctly.
-
## load libraries
- library("tidyverse")
- library("sf")
- library("tigris")
- library("leaflet")
-
There are two ways to download the USA shape data. First, we can use the R package, tigris. Second, we can download it from the Census website.
- -I prefer using tigris but I’ve been having some problems with it. Sometimes it ignores the Great Lakes and merges Michigan and Wisconsin into a Frankenstate (boxed in red below).
- - - -tigris()
downloads the TIGER/Line shapefile data directly from the Census and includes a treasure trove of data. Some of the data includes land area, water area, state names, and geometry.
Tigris can also download boundaries for counties, divisions, regions, tracts, blocks, congressional and school districts, and a whole host of other groupings. A complete list of available data can be found on the package’s GitHub.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-
## download state data using tigris()
- us_states <- tigris::states(cb = FALSE, year = 2020) %>%
- filter(STATEFP < 57) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted shapefile
- st_write(us_states, "path/to/file/usa.shp")
-
Here we create the us_states
variable, save the geographic data to it, move Alaska and Hawaii so they’re beneath the continental US, and save the shifted shapefile.
R uses the <-
operator to define new variables. Here, we’re naming our new variable us_states
.
In our us_states
variable we’re going to store data on the 50 states downloaded using tigris
. Within (::
) tigris, we’re going to use the states()
function.
The states()
function allows you to pull state-level data from the Census. This function takes several arguments
The cb
argument can either be TRUE
or FALSE
. If cb = FALSE
, tigris downloads the most detailed shapefile. If cb = TRUE
, it will download a generalized (1:500k) file. After a lot of trial and error, I found that using cb = TRUE
prevents the Frankenstate from happening.
If the year
argument is omitted, it will download the shapefile for the default year (currently 2020). I set it out of habit from working with county boundaries, where I have to set the year because county boundaries change more often than state boundaries.
Finally, the %>%
operator is part of the Tidyverse. It basically tells R “Hey! I’m not done, keep going to the next line!”
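To see what the pipe is doing, here’s a minimal sketch comparing a nested call with the piped version. The `acres` numbers are made up; the point is only that both forms give the same result.

```r
library(dplyr)   # loading dplyr also makes %>% available

acres <- c(4, 9, 16)   # made-up numbers

# Nested call: read from the inside out
nested <- round(mean(sqrt(acres)), 1)

# Piped call: read from top to bottom, one step per line
piped <- acres %>%
  sqrt() %>%
  mean() %>%
  round(1)

identical(nested, piped)
```

Each `%>%` hands the result on its left to the function on its right as the first argument, which is why the long `states()` call above can be read one step at a time.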
tigris::states()
downloads data for the 50 states and the United States’ minor outlying islands, Puerto Rico, and its associated territories. Each state and territory is assigned a unique two-digit Federal Information Processing Standard [FIPS] code.
They’re mostly consecutive (Alabama is 01, Alaska is 02). When the codes were conceived in the 1970s, a few were reserved for the US territories (American Samoa was 03). In the updated version the “reserved codes” were left out and the territories were assigned new numbers (American Samoa is now 60). The important bit is that the last official state (Wyoming) has a FIPS of 56.
- -This line of code uses the filter()
function on the STATEFP
variable downloaded using tigris. All it says is keep any row that has a FIPS of less than 57. This keeps the 50 states plus the District of Columbia (FIPS 11) and excludes the United States’ empire of associated territories.
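The FIPS filter can be checked on a handful of codes. This sketch uses a small hand-typed table rather than the real tigris download. `STATEFP` is stored as text, so the example converts it to a number first to make the comparison explicitly numeric; that conversion is a cautious variation on the code above, not a required step.

```r
# A few real FIPS codes, stored as text the way STATEFP is
fips <- data.frame(
  NAME    = c("Alabama", "District of Columbia", "Wyoming",
              "American Samoa", "Puerto Rico"),
  STATEFP = c("01", "11", "56", "60", "72")
)

# Convert to a number so "less than 57" is a numeric comparison
kept <- fips[as.integer(fips$STATEFP) < 57, ]

kept$NAME
```

Note that the District of Columbia survives the filter. If you truly want only the 50 states, you’d need to drop FIPS 11 as well.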
The shift_geometry()
is from the Tigris package. It takes two arguments preserve_area
and position
.
When preserve_area = FALSE
tigris will shrink Alaska’s size and increase Hawaii’s so that they are comparable to the size of the other states.
The position
argument can either be "below"
or "outside"
. When it’s below
, both Alaska and Hawaii are moved to be below California. When it’s outside
then Alaska is moved to be near Washington and Hawaii is moved to be near California.
Since I’m a born-theorist, I should warn you that messing with maps has inherent normative implications. The most common projection is Mercator which stretches the continents near the poles and squishes the ones near the equator.
 - - - -One of the competing projections is Gall-Peters, which claims to be more accurate because it was - at the time Arno Peters popularized it in the 1970s - the only “area-correct map.” It has since been criticized for skewing the polar continents and the equatorial ones. The above photo shows you just how different the projections are from one another.
- -The problem arises because we’re trying to project a 3D object into 2D space. It’s a classic case of even though we can, maybe we shouldn’t. Computers can do these computations and change the projections to anything we want fairly easily. However, humans think and exist in metaphors. We assume bigger = better and up = good. When we project maps that puts the Northern Hemisphere as both upwards and larger than other parts of the world we are imbuing that projection with metaphorical meaning.
- -I caution you to be careful when creating maps. Think through the implications of something as simple as making Alaska more visually appealing by distorting it to be of similar size as the other states.
- -If you want to read more about map projections this is a good post. If you want to read more about metaphors, I suggest Metaphors We Live By by George Lakoff and Mark Johnson.
- -The sf
package includes a function called st_transform()
which will reproject the data for us. There are a lot of projections. You can read about them at the PROJ website.
Leaflet requires all boundaries use the World Geodetic System 1984 (WGS84) coordinate system. While making maps I’ve come across two main coordinate systems: WGS84 and North American Datum 1983 (NAD83). WGS84 uses the WGS84 ellipsoid and NAD83 uses the Geodetic Reference System 1980 (GRS80) ellipsoid. From what I’ve gathered, the differences are slight, but Leaflet requires WGS84 and the Census uses NAD83. As a result, we have to reproject the data in order to make our map.
- -The st_transform
function is given a single PROJ string here, made up of four parameters that are each preceded by a +
. I include all four parameters to transform the data from NAD83 to WGS84.
Briefly, +proj=longlat
tells R to project the data into longitude and latitude [rather than, for example, transverse mercator (tmerc
)].
+ellps=WGS84
sets the ellipsoid to the WGS84 standard.
+datum=WGS84
is a holdover from previous proj releases. It tells R to use the WGS84 datum.
+no_defs
is also a holdover.
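As an alternative to spelling out the PROJ string, `st_transform()` also accepts an EPSG code. The sketch below is not the code used in this series: it reprojects a single made-up point from NAD83 (EPSG:4269) to WGS84 (EPSG:4326) so the change in coordinate system can be checked without downloading anything.

```r
library(sf)

# A single made-up point stored in NAD83 (EPSG:4269)
pt_nad83 <- st_sfc(st_point(c(-114.5, 36.4)), crs = 4269)

# Reproject to WGS84 using the EPSG code instead of a PROJ string
pt_wgs84 <- st_transform(pt_nad83, 4326)

# Check which coordinate system each object reports
st_crs(pt_nad83)$epsg
st_crs(pt_wgs84)$epsg
```

Either form should work for Leaflet. The EPSG code is just shorter and harder to mistype.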
Essentially, you need to include line 6 before you create the map, but after you do any data manipulation. It might throw some warnings which you can just ignore.
- -In the last line, we save the data we manipulated in lines 2-6. Strictly speaking you don’t have to save the shapefile. You can manipulate the data and then skip right to mapping the data. I caution against it because the files can get unreadable once you start using multiple data sets. I usually comment out line 9 after I save the file. That way I’m not saving and re-saving it whenever I need to run the code above it.
- -The st_write()
function is part of the sf
package and it takes two arguments. The first is the data set you want to save. Since I used us_states
to save the data, it will be the first argument in the st_write()
function call.
The second argument is the path to where you want the file saved and what name you want to give it. I named mine usa
. It is mandatory that you add .shp
to the end of the filepath so that R knows to save it as a shapefile.
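Here is a small sketch of `st_write()` that can be run without any real data. Everything in it is an assumption for illustration: the one-point `demo` layer is made up, and it writes to a temporary folder rather than to the project’s shapefiles folder.

```r
library(sf)

# Tiny made-up layer: one point with one attribute
demo <- st_sf(
  name     = "demo",
  geometry = st_sfc(st_point(c(-114.5, 36.4)), crs = 4326)
)

# Write to a fresh temporary folder so nothing real is overwritten
out_dir <- tempfile("shp_demo")
dir.create(out_dir)
out_path <- file.path(out_dir, "demo.shp")

# First argument: the data set. Second argument: where to save it.
st_write(demo, out_path, quiet = TRUE)

# List the files that were written alongside demo.shp
sort(list.files(out_dir))
```

Listing the folder afterwards shows the companion files (`.dbf`, `.prj`, `.shx`) that get written next to the `.shp` file.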
Although it’s called a shapefile, it’s actually four files. I usually create a separate folder for each set of shapefiles and store that in one master folder called shapefiles. An example of my folder structure is below. I keep all of this in my GitHub repo and track changes using DVC.
- - - -On my C://
drive is My Documents
. In that folder I keep a GitHub
folder that holds all my repos, including my nps
one. Inside the nps
folder I separate my shapefiles into their own folder. For this tutorial I am using original and shifted shapefiles, so I’ve also separated them into two separate folders to keep things neat. I also know I’m going to have multiple shapefiles (one for the USA, one for the National Parks, and a final one for the State Parks) so I created a folder for each set. In the usa
folder I saved the shifted states shapefile.
Altogether, my line 9 would read:
- - - -Running that line will save the four necessary files that R needs to load the geographic data.
- -That’s it for method 1 using tigris
. The next section, method 2, shows how to load and transform a previously downloaded shapefile. If you used method 1, feel free to leave this post and go directly to mapping the shapefile in part II of this series.
In this section, I’ll go through the process of downloading the shapefiles from the Census website. If you tried method 1 and tigris caused the weird Frankenstate, you can try using the data downloaded from the Census website. I don’t know why it works, since tigris uses the same data, but it does.
 - -Generally, though, finding and using shapefiles created by others is a great way to create cool maps. There are thousands of shapefiles available, many from ArcGIS’ Open Data Website.
- -Save the file wherever you want, but I prefer to keep it within the “original” shapefiles folder in a sub-folder called “zips.” Once it downloads, unzip it - again, anywhere is fine. It will download all 30 Census shapefiles. We’re only going to use the one called “cb_2021_us_state_500k.zip”. The rest you can delete, if you want.
- - - -When you unzip the cb_2021_us_state_500k.zip, it will contain four files. You’ll only ever work with the .shp
file, but the other three are used in the background to display the data.
Once all the files are unzipped, we can load the .shp
file into R.
1
-2
-3
-4
-5
-6
-7
-8
-9
-
## load a previously downloaded shapefile
- usa <- read_sf("shapefiles/original/usa/states/cb_2021_us_state_500k.shp") %>%
- filter(STATEFP < 57) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted shapefile
- st_write(usa, "path/to/file/usa.shp")
-
Everything except line 2 is the same as in method 1. I won’t go over lines 3-9 here, because all the information is above.
- -This line is very similar to the one above. I changed the name of the variable to usa
so I could keep both methods in the same R file (each R variable needs to be unique or it will be overwritten).
read_sf
is part of the sf package. It’s used to load shapefiles into R. The path to the file is enclosed in quotation marks and parentheses. Simply navigate to wherever you unzipped the cb_2021_us_state_500k file and choose the file with the .shp
extension.
Once the shapefiles are downloaded - either using tigris() or by loading the shapefiles from the Census website - you can create the base map. I’ll tackle making the base map in part II of this series.*
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/53/a7d803818d88739e14dbd3513573f7d4d900156de2cbec5a5e032fe1030e2d b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/53/a7d803818d88739e14dbd3513573f7d4d900156de2cbec5a5e032fe1030e2d deleted file mode 100644 index 8261b95..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/53/a7d803818d88739e14dbd3513573f7d4d900156de2cbec5a5e032fe1030e2d +++ /dev/null @@ -1,237 +0,0 @@ -I"ZAWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
 - -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
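To get a feel for what `st_layers()` returns without downloading PAD-US, here is a sketch that builds a tiny geopackage in a temporary file. The layer names and points are made up. The last step uses the `query` argument of `st_read()` to peek at a few rows of one layer, which is one possible way to look at a large layer without loading all of it; it is not the approach used in this post.

```r
library(sf)

# 100 made-up points to stand in for a large layer
pts <- st_sf(
  id       = 1:100,
  geometry = st_sfc(lapply(1:100, function(i) st_point(c(i, i))), crs = 4326)
)

# Write two layers (hypothetical names) into one temporary geopackage
gpkg <- tempfile(fileext = ".gpkg")
st_write(pts, gpkg, layer = "first_layer", quiet = TRUE)
st_write(pts[1:10, ], gpkg, layer = "second_layer", quiet = TRUE)

# List the layers without reading their contents
layers <- st_layers(gpkg)
layers$name
layers$features

# Peek at the first five rows of one layer instead of loading all of it
peek <- st_read(gpkg, query = "SELECT * FROM first_layer LIMIT 5", quiet = TRUE)
nrow(peek)
```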
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
 - -The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation. Having visited the park, I can tell you there are no fences blocking these areas off. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area. It will be a good test case to make sure I’m selecting the correct data.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
-                                 TRUE ~ "not visited")) %>%
-     tigris::shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read() rather than read_sf(). Either function can read a geopackage, since read_sf() is a wrapper around st_read() with different defaults.
Here, st_read() is given two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35, make sure you choose the one with the .gpkg extension.
The second argument, layer =, specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
dplyr’s filter() function (loaded with the tidyverse) lets you combine conditions within one filter() call by using | (or) or & (and).
The logic in this line says: filter the data for rows where the State_Nm is not in the territories list (discard everything but the 50 states) and the Own_Type is STAT. For a row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
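One possible reason a combined call misbehaves: in R, & is evaluated before |, so a chain of | conditions needs its own parentheses when it sits next to &. Below is a minimal sketch on a made-up toy table (the rows and the keep_types vector are my own illustrative assumptions, not PAD-US data) that uses %in% in place of the long | chain.

```r
library(dplyr)

# Toy stand-in for the Fee layer; every value here is invented
toy <- tibble(
  State_Nm = c("NV", "OR", "PR", "CA", "CA"),
  Own_Type = c("STAT", "STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SREC", "SP", "MIL", "MIL")
)

territories <- c("AS", "GU", "MP", "PR", "VI")
keep_types  <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# One filter() call: comma-separated conditions are combined with "and"
combined <- toy %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% keep_types)

# Without parentheses, & is evaluated before |, so the federal MIL row slips through
no_parens   <- toy %>% filter(Own_Type == "STAT" & Des_Tp == "SP" | Des_Tp == "MIL")
with_parens <- toy %>% filter(Own_Type == "STAT" & (Des_Tp == "SP" | Des_Tp == "MIL"))
```

In this toy example, no_parens keeps one more row than with_parens, which is the kind of silent difference that makes a long mixed filter hard to debug.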
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to the nine designation codes listed in lines 7-15: ACC, HCA, REC, SCA, SHCA, SP, SREC, SRMA, and SW.
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter() call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter() call, I chose to use != solely because months or years from now, when I look at this code, it will be easier for me to figure out what I was doing. I know myself, and if I saw d_Pub_Acce == "Open Access" my first thought would be: “What are the other types?” and then I’d try to find out and waste a bunch of time.
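A quick way to answer the “what are the other types?” question is to list the categories with count(). The sketch below uses an invented toy table (the rows are assumptions for illustration). It also shows one thing worth knowing about !=: filter() drops rows where the comparison comes out as NA, so missing values disappear along with Closed and Unknown.

```r
library(dplyr)

# Toy table; names and values are made up for illustration
toy <- tibble(
  Unit_Nm    = c("A", "B", "C", "D", "E"),
  d_Pub_Acce = c("Open Access", "Closed", "Restricted Access", "Unknown", NA)
)

# List every access category and how many rows fall into each
access_counts <- toy %>% count(d_Pub_Acce)

# Keep rows that are neither Closed nor Unknown;
# the NA row is dropped too, because NA != "Closed" evaluates to NA
visitable <- toy %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")
```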
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, and GIS_Acres.
-mutate() is part of dplyr (loaded with the tidyverse) and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value"). I only needed to change the values for the "Recreation Management Area"s; for the rest, I just populated the new column with the old values.
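Here is a small sketch of that general form on an invented three-row table (the "Something Else" value is a made-up placeholder). It shows that rows matching no condition come back as NA unless a catch-all TRUE ~ line supplies a fallback, which is one way to carry the old values over.

```r
library(dplyr)

# Toy table; "Something Else" is an invented value that matches no rule
toy <- tibble(d_Des_Tp = c("State Park", "Access Area", "Something Else"))

recoded <- toy %>%
  mutate(
    # No fallback: the unmatched row becomes NA
    type_no_default = case_when(
      d_Des_Tp == "State Park"  ~ "State Park or Parkway",
      d_Des_Tp == "Access Area" ~ "State Trail"
    ),
    # Catch-all: unmatched rows keep their original value
    type_with_default = case_when(
      d_Des_Tp == "State Park"  ~ "State Park or Parkway",
      d_Des_Tp == "Access Area" ~ "State Trail",
      TRUE                      ~ d_Des_Tp
    )
  )
```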
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate() creates a new column, visited, and populates it by using case_when().
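One possible way to keep the visited list manageable, sketched under my own assumptions: store the park names in a character vector (visited_parks is a hypothetical name) and test membership with %in% instead of writing one case_when() line per park. The toy rows below are invented.

```r
library(dplyr)

# Hypothetical lookup vector; add a name here instead of a new case_when() line
visited_parks <- c("Valley of Fire State Park", "Salton Sea")

# Toy table standing in for the park data
toy <- tibble(Unit_Nm = c("Valley of Fire State Park", "Salton Sea", "Some Other Park"))

marked <- toy %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))
```

The vector could also live in its own file, so adding a park never touches the processing code.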
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data for Alaska and Hawaii so it appears below the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “hello world!”**
- -* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me a line if it is.
-** print("Hello World!")
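As a rough sketch of the split-and-save idea, here is a for loop, which runs its body once per item rather than “as long as condition X is true”. It works on an invented toy table with no geometry and writes CSV files to a temporary folder. The parks table, out_dir, and the CSV output are all stand-ins I made up for illustration; with the real sf data, st_write() and a .shp path would take the place of write.csv().

```r
# Toy stand-in for the processed park data (invented rows, no geometry)
parks <- data.frame(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  State_Nm = c("NV", "CA", "CA")
)

# Temporary output folder so the sketch does not touch real project files
out_dir <- file.path(tempdir(), "states")
dir.create(out_dir, showWarnings = FALSE)

# Run the body once for every state abbreviation in the data
for (state in unique(parks$State_Nm)) {
  one_state <- parks[parks$State_Nm == state, ]
  # With sf data, st_write() and a .shp path would replace write.csv() here
  write.csv(one_state, file.path(out_dir, paste0(state, ".csv")), row.names = FALSE)
}

saved_files <- sort(list.files(out_dir))
```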
Twitter is a great resource for engaging with the academic community. For example, I saw this Tweet by PhD Genie asking users to name one positive skill learned during their PhD. I love this question for a number of reasons. First, it helps PhDs reframe their experience so it’s applicable outside of academia - which can help when applying to jobs. Second, it’s really cool to see what skills other people have learned during their program.
-
I responded to the tweet because during my PhD I learned how to create maps in R. I started by recreating a map from the University of North Carolina’s Hussman School of Journalism’s News Deserts project (below). Now, I am working on a personal project mapping the U.S. National and State parks.
- - - -There was quite a bit of interest in how to do this, so in this series of posts I will document my process from start to finish.
- -First, I’m not an expert. I wanted to make a map, so I learned how. There may be easier ways and, if I learn how to do them, I’ll write another post.
- -Second, before starting, I strongly suggest setting up GitHub and DVC. I wrote about how to use GitHub, the GitHub website, and GitHub Desktop. You can use any of these methods to manage your repositories. I use all three based purely on whatever mood I’m in.
- -If you do use Git or GitHub, then DVC (data version control) is mandatory. GitHub will warn you that your file is too large if it’s over 50MB and reject your pushes if the files are over 100MB. The total repository size can’t exceed 2GB if you’re using the free version (which I am). DVC is useful because cartography files are large. They contain a lot of coordinates, and the number grows with each location you try to map. DVC stores your data outside of GitHub but still lets you track changes to your data. It’s super useful.
- -Third, there are several ways to make a map. R is capable of making interactive maps and static maps. Static maps are less computationally expensive and better for publication. Interactive maps are prettier and better for displaying on the web.
- -I make interactive maps with Leaflet and Shiny because they offer a lot of functionality. The most common way is to use map tiles. Map tiles use data from sources like Open Street Map and Maps to create map squares (tiles) with custom data on top. A list of available map tiles is available on the Open Street Maps website.
- - - -When I make static maps (like the US map pictured above), I use ggplot2.
- -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- -I. cartography in r part one [this post]
-III. cartography in r part three
-IV. cartography in r part four
-You only need to install the packages once. You can do so by running each line in the R console. When you rerun the code later, you can skip right to loading the packages using library("package-name").
-
## you only need to install the packages once
-
- install.packages("leaflet") # interactive maps
- install.packages("shiny") # added map functionality
- install.packages("tidyverse") # data manipulation
- install.packages("tigris") # cartographic boundaries
- install.packages("operator.tools") # for the not-in function
- install.packages("sf") # read and write shapefiles
-
Leaflet creates interactive maps. A map starts with leaflet() and is built up with layers such as addTiles(), addPolygons(), or addMarkers().
- The Tidyverse is a collection of packages used for data manipulation and analysis. Its syntax is more intuitive than base R. Furthermore, you can chain (aka pipe) commands together.
- -For cartography, you don’t need the whole Tidyverse. We’ll mainly use dplyr and ggplot2. You can install these packages individually instead of installing the whole tidyverse. Though, when we get to the national park database, we’ll also need purrr and tidyr.
operator.tools is not required, but it’s recommended.
- -For some unknown reason, base R has a %in% function but not a not-in function. Unfortunately, the United States is still an empire with its associated areas, islands, and pseudo-states. I only want to include the 50 states, so I needed a way to easily filter out the non-states. The %!in% function from operator.tools is perfect for that.
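If you would rather skip the extra package, the same effect is available in base R. The sketch below is a minimal illustration: %notin% is a name I made up for this example (it is not part of operator.tools), and the abbr vector is invented.

```r
# Define a not-in operator by negating base R's %in%
# (%notin% is a hypothetical name chosen for this example)
`%notin%` <- Negate(`%in%`)

territories <- c("AS", "GU", "MP", "PR", "VI")
abbr <- c("NV", "PR", "OR", "GU")

# Two equivalent ways to drop the territories
kept_a <- abbr[!(abbr %in% territories)]
kept_b <- abbr[abbr %notin% territories]
```

Both versions keep only the abbreviations that are not in the territories list.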
To start, create and save a new file called usa.r
. In it, we’re going to download and modify the United States shape data that we’ll use to create the base map in part two of this series.
At the beginning of each file, you have to load the necessary packages. In this file, the only packages we need to load are tidyverse, sf, and tigris. I also load leaflet to make sure the map renders correctly.
-
## load libraries
- library("tidyverse")
- library("sf")
- library("tigris")
- library("leaflet")
-
There are two ways to download the USA shape data. First, we can use the R package, tigris. Second, we can download it from the Census website.
- -I prefer using tigris but I’ve been having some problems with it. Sometimes it ignores the Great Lakes and merges Michigan and Wisconsin into a Frankenstate (boxed in red below).
- - - -tigris()
downloads the TIGER/Line shapefile data directly from the Census and includes a treasure trove of data. Some of the data includes land area, water area, state names, and geometry.
Tigris can also download boundaries for counties, divisions, regions, tracts, blocks, congressional and school districts, and a whole host of other groupings. A complete list of available data can be found on the package’s GitHub.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-
## download state data using tigris()
- us_states <- tigris::states(cb = FALSE, year = 2020) %>%
- filter(STATEFP < 57) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted shapefile
- st_write(us_states, "path/to/file/usa.shp")
-
Here we create the us_states
variable, save the geographic data to it, move Alaska and Hawaii so they’re beneath the continental US, and save the shifted shapefile.
R uses the <-
operator to define new variables. Here, we’re naming our new variable us_states
.
In our us_states
variable we’re going to store data on the 50 states downloaded using tigris
. Within (::
) tigris, we’re going to use the states()
function.
The states()
function allows you to pull state-level data from the Census. This function takes several arguments, including cb and year.
The cb
argument can either be TRUE
or FALSE
. Setting cb = FALSE
tells tigris to download the most detailed shapefile. Setting cb = TRUE
downloads a generalized (1:500k) file. After a lot of trial and error, I found that using cb = TRUE
prevents the Frankenstate from happening.
If the year
argument is omitted it will download the shapefile for the default year (currently 2020). I set it out of habit from working with county boundaries, where I have to set the year because county boundaries change more often than state boundaries.
Finally, the %>%
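operator is part of the Tidyverse. It basically tells R “Hey! I’m not done, keep going to the next line!” More precisely, it passes the result on its left as the first argument to the function on its right.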
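To make the pipe a little more concrete, here is a small sketch that runs the same two steps with and without it. The three-row table is made up for illustration; it only stands in for the real states data.

```r
library(dplyr)

# Toy table standing in for the states data (illustrative values only)
states <- data.frame(
  NAME    = c("Alabama", "Wyoming", "Guam"),
  STATEFP = c("01", "56", "66"),
  stringsAsFactors = FALSE
)

# Without the pipe: the calls are nested and read inside-out
nested <- select(filter(states, STATEFP != "66"), NAME)

# With the pipe: each result becomes the first argument of the next call
piped <- states %>%
  filter(STATEFP != "66") %>%
  select(NAME)

# Both approaches produce the same table
identical(nested, piped)
```

I find the piped version easier to follow once there are more than two steps, which is why every block in this series is written that way.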
tigris::states()
downloads data for the 50 states and the United States’ minor outlying islands, Puerto Rico, and its associated territories. Each state and territory is assigned a unique two-digit Federal Information Processing Standard [FIPS] code.
They’re mostly consecutive (Alabama is 01), but when they were conceived of in the 1970s a couple were reserved for the US territories (American Samoa was 03). In the updated version the “reserved codes” were left out and the territories were assigned new numbers (American Samoa is now 60). The important bit is that the last state alphabetically (Wyoming) has a FIPS of 56.
- -This line of code uses the filter()
function on the STATEFP
variable downloaded using Tigris(). All it says is keep any row that has a FIPS of less than 57. This keeps the 50 states (plus the District of Columbia, FIPS 11) and excludes the United States’ empire of associated territories.
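One thing worth knowing about that filter: as far as I can tell, tigris stores STATEFP as text rather than as a number. The comparison still works for zero-padded two-digit codes, but a more explicit sketch converts the column first. The five-row table below is made up for illustration.

```r
library(dplyr)

# Toy FIPS table (a small illustrative subset, not the real download)
fips <- data.frame(
  NAME    = c("Alabama", "District of Columbia", "Wyoming",
              "American Samoa", "Puerto Rico"),
  STATEFP = c("01", "11", "56", "60", "72"),
  stringsAsFactors = FALSE
)

# Convert the text codes to integers before comparing them numerically
kept <- fips %>%
  filter(as.integer(STATEFP) < 57)

kept$NAME
```

Note that the District of Columbia survives the filter. If you want strictly the 50 states, you would need one more condition to drop FIPS 11.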
The shift_geometry()
function is from the tigris package. Here I set two of its arguments: preserve_area
and position
.
When preserve_area = FALSE
tigris will shrink Alaska’s size and increase Hawaii’s so that they are comparable to the size of the other states.
The position
argument can either be "below"
or "outside"
. When it’s below
, both Alaska and Hawaii are moved to be below California. When it’s outside
then Alaska is moved to be near Washington and Hawaii is moved to be near California.
Since I’m a born-theorist, I should warn you that messing with maps has inherent normative implications. The most common projection is Mercator which stretches the continents near the poles and squishes the ones near the equator.
- - - -One of the competing projections is Gall-Peters which claims to be more accurate because it was - at the time Arno Peters promoted it in the 1970s - the only “area-correct map.” Though it has now been criticized for skewing the polar continents and the equatorial ones. The above photo shows you just how different the projections are from one another.
- -The problem arises because we’re trying to project a 3D object into 2D space. It’s a classic case of even though we can, maybe we shouldn’t. Computers can do these computations and change the projections to anything we want fairly easily. However, humans think and exist in metaphors. We assume bigger = better and up = good. When we use a projection that puts the Northern Hemisphere both above and larger than other parts of the world, we are imbuing that projection with metaphorical meaning.
- -I caution you to be careful when creating maps. Think through the implications of something as simple as making Alaska more visually appealing by distorting it to be of similar size as the other states.
- -If you want to read more about map projections this is a good post. If you want to read more about metaphors, I suggest Metaphors We Live By by George Lakoff and Mark Johnson.
- -The sf
package includes a function called st_transform()
which will reproject the data for us. There are a lot of projections. You can read about them on the PROJ website.
Leaflet requires all boundaries use the World Geodetic System 1984 (WGS84) coordinate system. While making maps I’ve come across two main coordinate systems: WGS84 and North American Datum (1983). WGS84 uses the WGS84 ellipsoid and NAD83 uses the Geodetic Reference System 1980 (GRS80) ellipsoid. From what I’ve gathered, the differences are slight, but leaflet requires WGS84 and the Census uses NAD83. As a result, we have to reproject the data in order to make our map.
- -The st_transform
function takes the target coordinate system as a single PROJ string. The string here has four parameters, each preceded by a +
. Together they describe the WGS84 longitude/latitude system that the NAD83 data is transformed to.
Briefly, +proj=longlat
tells R to express the coordinates in longitude and latitude [rather than, for example, transverse Mercator (tmerc
)].
+ellps=WGS84
sets the ellipsoid to the WGS84 standard.
+datum=WGS84
is a holdover from previous proj releases. It tells R to use the WGS84 datum.
+no_defs
is also a holdover.
Essentially, you need to include line 6 before you create the map, but after you do any data manipulation. It might throw some warnings which you can just ignore.
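If the PROJ string feels opaque, here is a small sketch of the same reprojection on a single made-up point, once with the string used in this post and once with the EPSG shorthand. My assumption is that EPSG:4269 (NAD83) is a fair stand-in for the Census data and that EPSG:4326 is interchangeable with the PROJ string for Leaflet's purposes; the point itself is invented.

```r
library(sf)

# One made-up point in NAD83 (EPSG:4269), roughly in southern Nevada
pt_nad83 <- st_sfc(st_point(c(-114.5, 36.4)), crs = 4269)

# Reproject with the PROJ string used in this post
pt_proj <- st_transform(
  pt_nad83,
  "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"
)

# Reproject with the EPSG code for WGS84 longitude/latitude
pt_epsg <- st_transform(pt_nad83, 4326)

# Both results are geographic (longitude/latitude) coordinates
st_is_longlat(pt_proj)
st_crs(pt_epsg)$epsg
```

The EPSG code is shorter to type, but I kept the PROJ string in the tutorial code because it spells out what is being requested.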
- -In the last line, we save the data we manipulated in lines 2-6. Strictly speaking you don’t have to save the shapefile. You can manipulate the data and then skip right to mapping the data. I caution against it because the files can get unreadable once you start using multiple data sets. I usually comment out line 9 after I save the file. That way I’m not saving and re-saving it whenever I need to run the code above it.
- -The st_write()
function is part of the sf
package and it takes two arguments. The first is the data set you want to save. Since I used us_states
to save the data, it will be the first argument in the st_write()
function call.
The second argument is the path to where you want the file saved and what name you want to give it. I named mine usa
. It is mandatory that you add .shp
to the end of the filepath so that R knows to save it as a shapefile.
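Here is a minimal sketch of that save step that you can run without touching your real folders. It writes a one-point demo layer to a temporary directory; the `demo` object and the temporary `usa_` folder are assumptions for illustration only.

```r
library(sf)

# A one-row demo layer (invented point) standing in for us_states
demo <- st_sf(
  name     = "demo point",
  geometry = st_sfc(st_point(c(-114.5, 36.4)), crs = 4326)
)

# Build the output path in a fresh temporary folder
out_dir <- tempfile("usa_")
dir.create(out_dir)
out_path <- file.path(out_dir, "usa.shp")

# The .shp extension tells sf to use the shapefile driver
st_write(demo, out_path, quiet = TRUE)

# List everything that was written next to usa.shp
written <- list.files(out_dir)
print(written)
```

Listing the folder afterwards is a quick way to confirm that the companion files were saved alongside the .shp file.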
Although it’s called a shapefile, it’s actually four files. I usually create a separate folder for each set of shapefiles and store that in one master folder called shapefiles. An example of my folder structure is below. I keep all of this in my GitHub repo and track changes using DVC.
- - - -On my C://
drive is My Documents
. In that folder I keep a GitHub
folder that holds all my repos, including my nps
one. Inside the nps
folder I separate my shapefiles into their own folder. For this tutorial I am using original and shifted shapefiles, so I’ve also separated them into two separate folders to keep things neat. I also know I’m going to have multiple shapefiles (one for the USA, one for the National Parks, and a final one for the State Parks) so I created a folder for each set. In the usa
folder I saved the shifted states shapefile.
Altogether, my line 9 would read:
- - - -Running that line will save the four necessary files that R needs to load the geographic data.
- -That’s it for method 1 using tigris
. The next section, method 2, shows how to load and transform a previously downloaded shapefile. If you used method 1, feel free to leave this post and go directly to mapping the shapefile in part II of this series.
In this section, I’ll go through the process of downloading the shapefiles from the Census website. If you tried method 1 and tigris caused the weird Frankenstate, you can try using the data downloaded from the Census website. I don’t know why it works, since tigris uses the same data, but it does.
- -Generally, though, finding and using shapefiles created by others is a great way to create cool maps. There are thousands of shapefiles available, many from ArcGis’ Open Data Website.
- -Save the file wherever you want, but I prefer to keep it within the “original” shapefiles folder in a sub-folder called “zips.” Once it downloads, unzip it - again, anywhere is fine. It will download all 30 Census shapefiles. We’re only going to use the one called “cb_2021_us_state_500k.zip”. The rest you can delete, if you want.
- - - -When you unzip the cb_2021_us_state_500k.zip, it will contain four files. You’ll only ever work with the .shp
file, but the other three are used in the background to display the data.
Once all the files are unzipped, we can load the .shp
file into R.
1
-2
-3
-4
-5
-6
-7
-8
-9
-
## load a previously downloaded shapefile
- usa <- read_sf("shapefiles/original/usa/states/cb_2021_us_state_500k.shp") %>%
- filter(STATEFP < 57) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted shapefile
- st_write(usa, "path/to/file/usa.shp")
-
Everything except line 2 is the same as in method 1. I won’t go over lines 3-9 here, because all the information is above.
- -This line is very similar to the one above. I changed the name of the variable to usa
so I could keep both methods in the same R file (each R variable needs to be unique or it will be overwritten).
read_sf
is part of the sf package. It’s used to load shapefiles into R. The path to the file is enclosed in quotation marks and parentheses. Simply navigate to wherever you unzipped the cb_2021_us_state_500k file and choose the file with the .shp
extension.
Once the shapefiles are downloaded - either using tigris() or by loading the shapefiles from the Census website - you can create the base map. I’ll tackle making the base map in part II of this series.*
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/5c/bf7f3d5d72a108dc02f59e832fb832167b3d45db5bf3e123a6da07c1c13f23 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/5c/bf7f3d5d72a108dc02f59e832fb832167b3d45db5bf3e123a6da07c1c13f23 deleted file mode 100644 index 1ed99fd..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/5c/bf7f3d5d72a108dc02f59e832fb832167b3d45db5bf3e123a6da07c1c13f23 +++ /dev/null @@ -1,646 +0,0 @@ -I"ŚńWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
- library("tigris") # shift_geometry()
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to list the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
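If you want to see what st_layers() returns before pointing it at the full PAD-US download, here is a small sketch that builds a throwaway geopackage with two layers and lists them. The layer names `demo_fee` and `demo_easement` and the demo point are invented for illustration; they are not the real PAD-US layer names.

```r
library(sf)

# A one-row demo layer (invented point)
parks <- st_sf(
  Unit_Nm  = "Demo Park",
  geometry = st_sfc(st_point(c(-114.5, 36.4)), crs = 4326)
)

# Write the same data into two layers of a temporary geopackage
gpkg <- tempfile(fileext = ".gpkg")
st_write(parks, gpkg, layer = "demo_fee", quiet = TRUE)
st_write(parks, gpkg, layer = "demo_easement", quiet = TRUE)

# List the layers without loading their contents
layers <- st_layers(gpkg)
layers$name
```

The names in `layers$name` are what you pass to the layer argument when you later read a single layer.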
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
and name the layer I want. read_sf() takes the same arguments, so it would work here too.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
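My best guess is that the combined version failed because of operator precedence: R evaluates & before |, so a chain of | conditions has to be wrapped in parentheses or replaced with %in%. Here is a minimal sketch on a made-up four-row table; the column names match PAD-US, but the rows are invented.

```r
library(dplyr)

# Toy rows using PAD-US column names (values invented for illustration)
parks <- data.frame(
  Unit_Nm  = c("A", "B", "C", "D"),
  State_Nm = c("NV", "PR", "CA", "OR"),
  Own_Type = c("STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SP", "SP", "MIL"),
  stringsAsFactors = FALSE
)
territories <- c("AS", "GU", "MP", "PR", "VI")
keep_types  <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Pitfall: this reads as (STAT and ACC) or SP, so federal park C slips in
unparenthesized <- parks %>%
  filter(Own_Type == "STAT" & Des_Tp == "ACC" | Des_Tp == "SP")

# One filter() call: %in% replaces the chain of | conditions
combined <- parks %>%
  filter(!(State_Nm %in% territories) &
           Own_Type == "STAT" &
           Des_Tp %in% keep_types)

combined$Unit_Nm
```

Splitting the conditions across several filter() calls, as I did in the tutorial code, gives the same result; the single call is just more compact.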
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any column I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of dplyr (one of the Tidyverse packages) and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Areas"; for the rest, I just populated the new column with the old values.
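As a quick refresher on that pattern, here is a minimal sketch of mutate() with case_when() on a made-up table, including a fallback that keeps the original value when no condition matches. The four designations are just sample values, and the fallback line is my own addition rather than something the tutorial code uses.

```r
library(dplyr)

# Sample designation values (illustrative only)
designations <- data.frame(
  d_Des_Tp = c("State Park", "Access Area", "State Wilderness",
               "Marine Protected Area"),
  stringsAsFactors = FALSE
)

recoded <- designations %>%
  mutate(type = case_when(
    d_Des_Tp == "Access Area" ~ "State Trail",
    d_Des_Tp == "State Park"  ~ "State Park or Parkway",
    TRUE                      ~ d_Des_Tp  # fallback: keep the old value
  ))

recoded$type
```

Without the fallback line, any row that matches none of the conditions gets NA in the new column, so it is worth deciding which behavior you want.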
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
creates a new column visited
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
-
split_states <- split(state_parks, f = state_parks$State_Nm) # split the data by state
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
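If you want to test the split-and-loop logic before running it on the full park data, here is a minimal sketch in base R. It uses a made-up three-row table and writes CSV files to a temporary folder instead of shapefiles, so the file type and the `states_` folder are assumptions for illustration.

```r
# Toy park table (three rows, for illustration only)
parks <- data.frame(
  Unit_Nm  = c("Valley of Fire State Park", "Salton Sea",
               "Anza-Borrego Desert State Park"),
  State_Nm = c("NV", "CA", "CA"),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per state
split_parks <- split(parks, f = parks$State_Nm)

# Write each state's rows to its own file in a temporary folder
out_dir <- tempfile("states_")
dir.create(out_dir)

for (name in names(split_parks)) {
  write.csv(split_parks[[name]],
            file.path(out_dir, paste0(name, ".csv")),
            row.names = FALSE)
}

sort(list.files(out_dir))
```

The real code follows the same shape; it only swaps write.csv() for st_write() and the CSV extension for .shp.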
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me a line if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
- library("tigris") # shift_geometry()
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to list the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
- filter(Own_Type == "STAT")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
Here you’ll find several non-partisan ballot guides. I try to include as much information as possible without directly recreating the official voter guide. The information is sourced from the ballot guide, calmatters, ballotpedia, LA Times, voter’s edge, and Mercury News. I have included links to the campaign website or wherever most of the information came from.
+ +I have done my best to keep my views out of it.
+ +I started this to help my two aunts because they would ask me to simplify their ballots for them. Democracy relies on an informed and participatory citizenry, but it’s not always easy. This is meant to alleviate some of the burden.
+ +If you notice any errors, you feel like I’ve missed something, or you found this guide helpful feel free to send me an email [click the envelope at the bottom of the page]
+ +Click on a link to download the PDF.
+Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
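If loading a whole layer is too much for R, one option is to peek at only a few rows. The sketch below is a minimal illustration, not the code used for this map: it builds a tiny stand-in geopackage (the layer name toy_parks and its values are made up) so it runs without the PAD-US download, lists the layers with st_layers(), and then uses the query argument of st_read() to pull just two rows.

```r
library(sf)

# Build a tiny stand-in geopackage so the sketch runs without the PAD-US file.
# The layer name "toy_parks" and every value here are invented for illustration.
toy <- st_sf(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  Own_Type = c("STAT", "FED", "STAT"),
  geometry = st_sfc(
    st_point(c(-115, 36)),
    st_point(c(-124, 42)),
    st_point(c(-116, 33)),
    crs = 4326
  )
)
gpkg <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg, layer = "toy_parks", quiet = TRUE)

# List the layer names without reading any features.
layer_names <- st_layers(gpkg)$name

# Read only the first two rows of the layer instead of the whole thing.
peek <- st_read(gpkg, query = "SELECT * FROM toy_parks LIMIT 2", quiet = TRUE)
names(peek)
```

With the real data, the same pattern would point at the .gpkg file and a real layer name such as PADUS3_0Fee. I have not timed this against the full PAD-US file, so treat it as something to try rather than a guaranteed fix.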
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
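One way the separate filter() calls could be collapsed is with %in%. The sketch below runs on a toy table with invented values, reusing the column names and designation codes from the code above. It is an untested-on-PAD-US illustration, not the code behind this map. A possible reason a single combined call misbehaves is operator precedence: in R, & binds more tightly than |, so a mix of the two without parentheses does not group the way it reads.

```r
library(dplyr)

# Toy stand-in for the PAD-US attribute table (all values invented).
toy <- tibble(
  State_Nm = c("NV", "PR", "OR", "CA", "CA"),
  Own_Type = c("STAT", "STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SP", "SREC", "SP", "MIL")
)

territories <- c("AS", "GU", "MP", "PR", "VI")
keep_types  <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Comma-separated conditions inside filter() are combined with AND,
# and %in% replaces the long chain of == ... | == ... comparisons.
kept <- toy %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% keep_types)

kept
```

Keeping the designation codes in one vector also makes it easier to add or drop a type later without touching the filter itself.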
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
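Before trusting a != filter, it can help to tally the values first. The sketch below uses a toy table with invented rows. I have not checked whether the real d_Pub_Acce column contains missing values, so the NA row is an assumption added to show how filter() treats it.

```r
library(dplyr)

# Toy table; the NA row is an assumption to show how missing values behave.
toy <- tibble(
  Unit_Nm    = c("A", "B", "C", "D", "E"),
  d_Pub_Acce = c("Open Access", "Closed", "Restricted Access", "Unknown", NA)
)

# Tally each access value before filtering.
access_counts <- count(toy, d_Pub_Acce)

# != drops "Closed" and "Unknown". Rows with NA are dropped as well,
# because NA != "Closed" evaluates to NA and filter() keeps only TRUE.
visitable <- toy %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")

visitable
```

Comparing the tally with the row count after filtering shows whether anything disappeared that was not meant to.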
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for Recreation Management Areas; for the rest, I just populated the new column with the old values.
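As a small illustration of that general form, here is a sketch on a toy column (the "Marine Protected Area" row is invented to show the fallback). The final TRUE ~ d_Des_Tp line is an assumption about how one might keep unmatched labels; without a fallback, case_when() returns NA for anything that matches none of the conditions.

```r
library(dplyr)

# Toy designation column; "Marine Protected Area" is only here to hit the fallback.
toy <- tibble(
  d_Des_Tp = c("State Park",
               "Recreation Management Area",
               "State Wilderness",
               "Marine Protected Area")
)

relabelled <- toy %>%
  mutate(type = case_when(
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    d_Des_Tp == "State Wilderness"           ~ "State Preserve, Reserve, or Recreation Area",
    d_Des_Tp == "State Park"                 ~ "State Park or Parkway",
    TRUE                                     ~ d_Des_Tp  # unmatched rows keep their original label
  ))

relabelled
```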
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
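One possible way to keep track of visited parks without a case_when() line per park is to hold the names in a vector and test membership with %in%. This is a sketch of an alternative, not what the code above does. The vector name visited_parks is hypothetical, and the toy table stands in for the real state park data.

```r
library(dplyr)

# Hypothetical lookup vector: add a park name here after each visit.
visited_parks <- c("Valley of Fire State Park",
                   "Crissey Field State Recreation Site",
                   "Salton Sea")

# Toy stand-in for the state park table.
toy <- tibble(
  Unit_Nm = c("Valley of Fire State Park",
              "Some Other State Park",
              "Salton Sea")
)

# One membership test replaces a separate case_when() line per park.
flagged <- toy %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

flagged
```

The same vector could be read from a small text file, which would keep the list of parks out of the processing code entirely.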
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears below the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps and link them to the map of National Parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code, it’s available on the project repo.
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/5f/571a2bc7ca1fae55f3f2e509b9628eef379f451ad9895961cce87bd375dbd7 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/5f/571a2bc7ca1fae55f3f2e509b9628eef379f451ad9895961cce87bd375dbd7 deleted file mode 100644 index d1382d2..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/5f/571a2bc7ca1fae55f3f2e509b9628eef379f451ad9895961cce87bd375dbd7 +++ /dev/null @@ -1,617 +0,0 @@ -I"ůćWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There’s a couple of ways to view the data that does not rely on R. First, is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) I load the data using st_read()
instead of read_sf(), although read_sf() is a wrapper around st_read() and can read a geopackage too.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
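The combined condition can be tried on a small table before running it against the full layer. This is a minimal sketch, assuming a made-up three-column tibble in place of the PAD-US attributes; it uses base R's `!(x %in% y)`, which is equivalent to the `%!in%` operator from operator.tools.

```r
library(dplyr)

# Toy stand-in for the PAD-US attribute table (all values are made up)
parks <- tibble(
  Unit_Nm  = c("Park A", "Park B", "Park C", "Park D"),
  State_Nm = c("NV", "PR", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "FED", "STAT")
)

territories <- c("AS", "GU", "MP", "PR", "VI")

# Keep rows that are NOT in a territory AND are state owned.
# Both conditions must be TRUE for a row to survive.
state_owned <- parks %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT")

state_owned$Unit_Nm
```

Park B is dropped because it is in a territory, and Park C is dropped because it is not state owned.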
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
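One possible reason the designation test is hard to merge into the first filter() call is operator precedence: R evaluates & before |, so a chain of | conditions needs its own parentheses. The sketch below uses a made-up tibble ("OTHER" is a placeholder, not a PAD-US code) and shows a single filter() call that tests membership with %in% instead.

```r
library(dplyr)

# Toy data; "OTHER" is a placeholder designation, not a real PAD-US code
parks <- tibble(
  Unit_Nm  = c("Park A", "Area B", "Site C", "Park D"),
  Own_Type = c("STAT", "STAT", "STAT", "FED"),
  Des_Tp   = c("SP", "OTHER", "SHCA", "SP")
)

# The designation codes kept in the pipeline above
keep_types <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# One filter() call: ownership AND membership in the designation list
combined <- parks %>%
  filter(Own_Type == "STAT" & Des_Tp %in% keep_types)

# The same rows with chained ORs, which need their own parentheses
# because & is evaluated before |
chained <- parks %>%
  filter(Own_Type == "STAT" & (Des_Tp == "SP" | Des_Tp == "SHCA"))

combined$Unit_Nm
```

Keeping the codes in a vector also makes the list easier to edit later than nine separate comparisons.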
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
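One side effect of filtering with != is worth knowing: a comparison against a missing value returns NA, and filter() drops rows where the condition is NA. I have not checked whether d_Pub_Acce has missing values in PAD-US, so the sketch below uses a made-up tibble to show the behavior.

```r
library(dplyr)

# Toy data with one missing access value
access <- tibble(
  Unit_Nm    = c("Park A", "Park B", "Park C", "Park D"),
  d_Pub_Acce = c("Open Access", "Closed", NA, "Restricted Access")
)

# != returns NA for the missing value, and filter() drops NA rows,
# so Park C disappears along with the Closed park
kept <- access %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")

# To keep rows with a missing access value, say so explicitly
kept_with_na <- access %>%
  filter(is.na(d_Pub_Acce) | !(d_Pub_Acce %in% c("Closed", "Unknown")))

kept$Unit_Nm
kept_with_na$Unit_Nm
```

If the row counts differ between the two versions on the real data, some rows have a missing access value.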
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"
s; for the rest I just populated the new column with the old values.
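The general form can be sketched on a few made-up designation values. The `TRUE ~ d_Des_Tp` line is a fallback I added for illustration: it copies the original value whenever no earlier condition matches, and without it unmatched rows become NA.

```r
library(dplyr)

# Toy designation values; "Something Else" is a made-up label
designations <- tibble(
  d_Des_Tp = c("State Park", "Recreation Management Area",
               "State Wilderness", "Something Else")
)

relabeled <- designations %>%
  mutate(type = case_when(
    d_Des_Tp == "State Park" ~ "State Park or Parkway",
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    TRUE ~ d_Des_Tp  # fallback: keep the original value
  ))

relabeled$type
```

Conditions are checked in order, so the fallback has to come last.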
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
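One possible way to track visited parks is to keep the names in a single vector and test membership, so adding a park means adding one string. This is only a sketch on a made-up tibble; `visited_parks` is a name I introduced for illustration.

```r
library(dplyr)

# Hypothetical lookup vector; add a name here when a park is visited
visited_parks <- c("Valley of Fire State Park",
                   "Crissey Field State Recreation Site",
                   "Salton Sea")

# Toy data; "Some Other State Park" is a made-up name
parks <- tibble(
  Unit_Nm = c("Valley of Fire State Park", "Some Other State Park", "Salton Sea")
)

# One membership test replaces one case_when() line per park
parks <- parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

parks$visited
```

The vector could also be read from a small text file, which would keep the list of visited parks out of the processing code.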
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the US Map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save it, and in part VI of this never-ending series* I’ll create individual state maps and link them to the map of National Parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/60/8fe78e1377ab217a91c1fd12543cdd0874db21b17dad0110e22a31681f4808 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/60/8fe78e1377ab217a91c1fd12543cdd0874db21b17dad0110e22a31681f4808 deleted file mode 100644 index 037d08f..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/60/8fe78e1377ab217a91c1fd12543cdd0874db21b17dad0110e22a31681f4808 +++ /dev/null @@ -1,622 +0,0 @@ -I"céWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data that do not require loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
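When a full layer is too heavy to load, st_read() also accepts a `query` argument that reads only the matching rows. The sketch below builds a tiny two-row geopackage in a temporary file so it runs without the PAD-US download; `demo_layer`, the coordinates, and the park names are made up for illustration.

```r
library(sf)

# Build a tiny geopackage so the sketch is self-contained
demo <- st_sf(
  Unit_Nm  = c("Park A", "Forest B"),
  Own_Type = c("STAT", "FED"),
  geometry = st_sfc(st_point(c(-114.5, 36.4)),
                    st_point(c(-120.0, 40.0)),
                    crs = 4326)
)
gpkg <- tempfile(fileext = ".gpkg")
st_write(demo, gpkg, layer = "demo_layer", quiet = TRUE)

# List the layer names without reading any features
st_layers(gpkg)$name

# Read only the state-owned rows instead of the whole layer
state_only <- st_read(
  gpkg,
  query = "SELECT * FROM demo_layer WHERE Own_Type = 'STAT'",
  quiet = TRUE
)

nrow(state_only)
```

Against the real file, the layer name in the query would be one of the names returned by st_layers(). I have not measured how much memory this saves on the national geopackage.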
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) I load the data using st_read()
instead of read_sf(), although read_sf() is a wrapper around st_read() and can read a geopackage too.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"
s; for the rest I just populated the new column with the old values.
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “hello world!”**
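A per-state loop could look roughly like the sketch below. It is an illustration on a made-up tibble, not the code from the next part of the series: it writes CSV files to a temporary folder so it runs anywhere, and with the real sf object the write.csv() call would be swapped for st_write() and a shapefile path.

```r
library(dplyr)

# Toy data standing in for the processed state park table
parks <- tibble(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  State_Nm = c("NV", "CA", "NV")
)

# Temporary output folder (an assumption; use a project folder in practice)
out_dir <- file.path(tempdir(), "states")
dir.create(out_dir, showWarnings = FALSE)

# Visit each state once, keep only its rows, and save them to their own file
for (st in unique(parks$State_Nm)) {
  one_state <- parks %>% filter(State_Nm == st)
  write.csv(one_state, file.path(out_dir, paste0(st, ".csv")), row.names = FALSE)
}

list.files(out_dir)
```

A for loop like this runs once per state and stops on its own, so there is no condition to keep track of.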
- -* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me a line if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data that do not require loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- -The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management and the Bureau of Reclamation. Having visited the park, I can tell you there are no fences blocking these areas off. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area. It will be a good test case to make sure I’m selecting the correct data.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -The table below shows a
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/61/e69b138269c656b580f2f597c5983db8321ec193bc95175b6f675ddb065435 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/61/e69b138269c656b580f2f597c5983db8321ec193bc95175b6f675ddb065435 deleted file mode 100644 index 715b456..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/61/e69b138269c656b580f2f597c5983db8321ec193bc95175b6f675ddb065435 +++ /dev/null @@ -1,622 +0,0 @@ -I"céWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to list the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can look at them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
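To make that output concrete, here is a minimal sketch that writes a one-layer geopackage to a temporary file and then lists its layers. The file, the layer name demo, and the single point are invented for illustration; the PAD-US file would list seven layers instead of one.

```r
library(sf)

# Toy data: one point with one attribute (not PAD-US data)
toy <- st_sf(
  name = "example",
  geometry = st_sfc(st_point(c(-114.5, 36.4)), crs = 4326)
)

# Write it to a temporary geopackage as a layer called "demo"
gpkg_path <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg_path, layer = "demo", quiet = TRUE)

# List the layers without reading the features into memory
toy_layers <- st_layers(gpkg_path)
toy_layers$name      # layer names
toy_layers$features  # number of features in each layer
```

Printing toy_layers on its own shows the same summary table described above.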
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) I load the data using st_read()
instead of read_sf(), although read_sf() also accepts a layer argument.
Here, st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
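One way the separate calls might be folded into one is to replace the chain of == comparisons with %in%. The sketch below uses a made-up five-row table rather than the PAD-US layer, and it swaps in base R's !(x %in% y) for %!in%, so treat it as an illustration and not as the code used for the map.

```r
library(dplyr)

territories <- c("AS", "GU", "MP", "PR", "VI")
keep_types  <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Invented rows that mimic the three columns used in the filters
toy_parks <- tibble(
  Unit_Nm  = c("A", "B", "C", "D", "E"),
  State_Nm = c("NV", "PR", "OR", "CA", "CA"),
  Own_Type = c("STAT", "STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SP", "SREC", "SP", "XYZ")  # "XYZ" is a placeholder code
)

# Conditions separated by commas are combined with "and"
one_call <- toy_parks %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% keep_types)

one_call$Unit_Nm
```

Only rows A and C pass all three conditions: B is in a territory, D is not state owned, and E has a designation outside the keep list.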
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse (it comes from the dplyr package) and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for Recreation Management Area
s; for the rest, I just populated the new column with the old values.
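As a small illustration of that general form, the sketch below recodes one value and copies every other value across unchanged through a final TRUE ~ branch. The three-row table is invented; only the column name d_Des_Tp and the recoded label come from the code above.

```r
library(dplyr)

# Invented rows; the real column has many more designations
toy_types <- tibble(
  d_Des_Tp = c("Recreation Management Area", "State Park", "State Wilderness")
)

recoded <- toy_types %>%
  mutate(type = case_when(
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    TRUE ~ d_Des_Tp   # anything unmatched keeps its old value
  ))

recoded$type
```

Without the TRUE ~ branch, unmatched rows would end up as NA in the new column.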
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
creates a new column visited
and populated it by using case_when()
.
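One possible way to make this less unwieldy, offered as a sketch rather than what the post's code does, is to keep the visited park names in a single vector and test membership with %in%. The table below is invented; the names in visited_parks are taken from the code above.

```r
library(dplyr)

# Add a park to this vector instead of adding a new case_when() line
visited_parks <- c("Valley of Fire State Park",
                   "Crissey Field State Recreation Site",
                   "Salton Sea")

# Invented rows standing in for the Unit_Nm column
toy_units <- tibble(
  Unit_Nm = c("Valley of Fire State Park", "Some Other State Park", "Salton Sea")
)

marked <- toy_units %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

marked$visited
```

The vector could also be read from a plain text file, which would keep the list of parks out of the code entirely.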
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
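For anyone who shares that struggle, here is a minimal sketch of the kind of loop described: it walks over each state abbreviation and writes that state's rows to its own file. The table, the output folder, and the CSV format are assumptions for illustration; the real data is spatial and would more likely be saved with st_write().

```r
# Invented table standing in for the state park data
toy_parks <- data.frame(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  State_Nm = c("CA", "NV", "CA")
)

out_dir <- tempdir()      # hypothetical output folder
written <- character(0)   # keeps track of the files created

for (state in unique(toy_parks$State_Nm)) {
  # keep only the rows for the current state
  one_state <- toy_parks[toy_parks$State_Nm == state, ]
  out_file <- file.path(out_dir, paste0(state, "_parks.csv"))
  write.csv(one_state, out_file, row.names = FALSE)
  written <- c(written, out_file)
}

basename(written)
```

The loop body runs once per unique value of State_Nm, so adding more states to the table needs no change to the code.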
- -* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me a line if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to list the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can look at them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) I load the data using st_read()
instead of read_sf(), although read_sf() also accepts a layer argument.
Here, st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to list the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
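- -Since we’re working with a geopackage (instead of a shapefile) I load the data using st_read()
instead of read_sf(). Either function can read a geopackage, because read_sf() is a wrapper around st_read() with quieter defaults.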
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
The Tidyverse’s filter()
function lets you combine conditions within a single call using the “or” operator |
or the “and” operator &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
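A likely culprit, though I haven’t confirmed it, is operator precedence: R evaluates & before |, so a long chain of “or” tests tacked onto “and” tests silently regroups itself. The sketch below uses a made-up five-row data frame (not the PAD-US data) and base R indexing instead of filter() so it runs on its own. The names parks, keep_types, loose, and strict are placeholders I invented for the illustration.

```r
# Toy stand-in for the PAD-US attribute table; every value here is invented.
parks <- data.frame(
  State_Nm = c("NV", "PR", "CA", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "FED", "STAT", "STAT"),
  Des_Tp   = c("SP", "SREC", "SP", "SREC", "MIL"),
  stringsAsFactors = FALSE
)
territories <- c("AS", "GU", "MP", "PR", "VI")
keep_types  <- c("SP", "SREC")  # hypothetical short list of designations

# Without parentheses, & binds tighter than |, so this reads as
# (not a territory AND state owned AND SP) OR (SREC).
loose <- !(parks$State_Nm %in% territories) &
  parks$Own_Type == "STAT" &
  parks$Des_Tp == "SP" | parks$Des_Tp == "SREC"

# Collapsing the designations into one %in% test keeps every condition in force.
strict <- !(parks$State_Nm %in% territories) &
  parks$Own_Type == "STAT" &
  parks$Des_Tp %in% keep_types

combined <- parks[strict, ]
print(parks[loose, ])  # the Puerto Rico row sneaks through
print(combined)        # only the NV and OR rows remain
```

The same idea should carry over to filter(): wrap the | chain in parentheses, or replace it with Des_Tp %in% a vector of codes.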
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’d try to find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
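One side effect of filtering with != is worth knowing about. Comparing a missing value with != returns NA rather than TRUE, and filter() drops any row whose condition is NA. I am assuming here that some rows have an empty Loc_Ds; I have not checked how many. The sketch uses invented values and base R’s subset(), which treats NA the same way filter() does. The names toy, dropped_na, and kept_na are placeholders.

```r
# Invented values standing in for the Loc_Ds column; NA marks a missing description.
loc_ds <- c("State Park", "Hunter Access", NA, "Public Boat Ramp")

toy <- data.frame(
  Unit_Nm = c("A", "B", "C", "D"),
  Loc_Ds  = loc_ds,
  stringsAsFactors = FALSE
)

# The third comparison is NA, not TRUE.
print(loc_ds != "Hunter Access")

# subset() keeps only rows where the test is TRUE, so the NA row is dropped
# along with the hunter access and boat ramp rows.
dropped_na <- subset(toy, Loc_Ds != "Hunter Access" & Loc_Ds != "Public Boat Ramp")

# An explicit is.na() test keeps the rows that have no description at all.
kept_na <- subset(toy, is.na(Loc_Ds) |
                    (Loc_Ds != "Hunter Access" & Loc_Ds != "Public Boat Ramp"))

print(dropped_na$Unit_Nm)
print(kept_na$Unit_Nm)
```

Whether the rows with a missing description should stay is a judgment call; the point is only that they vanish quietly unless you ask for them.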
- -Now that I’ve pared down the data a little bit, I want to discard any column I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of dplyr, which loads with the tidyverse, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for Recreation Management Areas; for the rest, I just populated the new column with the old values.
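A recode like this can also be written as a lookup table, which is a different technique from the case_when() call in the code above but produces the same kind of result. This is a minimal sketch in base R with a made-up input vector; type_lookup and d_des_tp are names I invented for the example, and the category strings are copied from the code above.

```r
# Named vector: the names are PAD-US designation descriptions,
# the values are the categories used on the map.
preserve <- "State Preserve, Reserve, or Recreation Area"
historic <- "State Historical Park, Site, Monument, or Memorial"

type_lookup <- c(
  "Access Area"                     = "State Trail",
  "Historic or Cultural Area"       = historic,
  "State Historic or Cultural Area" = historic,
  "Recreation Management Area"      = preserve,
  "State Resource Management Area"  = preserve,
  "State Wilderness"                = preserve,
  "State Recreation Area"           = preserve,
  "State Conservation Area"         = preserve,
  "State Park"                      = "State Park or Parkway"
)

# Invented sample of the d_Des_Tp column, including one value with no match.
d_des_tp <- c("State Park", "Access Area", "State Wilderness", "Something Else")

# Indexing by name looks up each description; unmatched values come back as NA,
# which is also what case_when() returns when no condition matches.
type <- unname(type_lookup[d_des_tp])
print(type)
```

The lookup keeps each description on a single line, so adding a designation later means adding one entry rather than another comparison.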
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
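One possible way to make the visited flag less unwieldy is to keep the park names in a single vector and test membership with %in%, instead of writing one case_when() line per park. This is only a sketch in base R with an invented sample of names; visited_parks and unit_nm are placeholder names, and the list could just as easily live in a small CSV.

```r
# The parks flagged in the code above, kept in one place.
visited_parks <- c(
  "Valley of Fire State Park",
  "Crissey Field State Recreation Site",
  "Salton Sea",
  "Anza-Borrego Desert State Park",
  "Jedediah Smith Redwoods State Park",
  "Del Norte Coast Redwoods State Park"
)

# Invented sample of the Unit_Nm column.
unit_nm <- c("Valley of Fire State Park", "Some Other State Park", "Salton Sea")

# %in% returns TRUE or FALSE for every row, so there is no NA to handle.
visited <- ifelse(unit_nm %in% visited_parks, "visited", "not visited")
print(visited)
```

Inside the pipeline, the same test could sit in a single mutate() call, and adding a park would mean adding one string to the vector.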
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
-
split_states <- split(state_parks, f = state_parks$State_Nm) # split the data by state
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))
- }
-
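To see what split() and the loop are doing without touching the real data, here is a small sketch of the same pattern on an invented three-row data frame. It writes CSV files to a temporary folder instead of shapefiles, so it runs without sf; toy_parks, by_state, and out_dir are placeholder names. It also creates the output folder first, which I am assuming the shapefile version needs as well.

```r
# Invented stand-in for state_parks: one row per park, no geometry.
toy_parks <- data.frame(
  Unit_Nm  = c("Valley of Fire State Park", "Salton Sea",
               "Anza-Borrego Desert State Park"),
  State_Nm = c("NV", "CA", "CA"),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per state abbreviation.
by_state <- split(toy_parks, f = toy_parks$State_Nm)

# Write into a temporary folder, creating it before the loop runs.
out_dir <- file.path(tempdir(), "individual")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# One pass per state: the list name doubles as the file name.
for (name in names(by_state)) {
  write.csv(by_state[[name]],
            file.path(out_dir, paste0(name, ".csv")),
            row.names = FALSE)
}

written <- sort(list.files(out_dir, pattern = "\\.csv$"))
print(written)
```

The loop never checks a condition by hand; it simply visits each name in the list once, which is why the split step does most of the work.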
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me a line if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There’s a couple of ways to view the data that does not rely on R. First, is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Fields which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited") %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the layer PADUS3_0Fee layer because I know it contains the shape data from the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After these the two conditions in this line the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is not Closed or Unknown
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want discard any column don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area
s, the rest I just populated the new column with the old values.
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had original wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the mean time I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
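To show what that conversion looks like in isolation, here is a small sketch on a single made-up point. I am assuming EPSG:5070 (NAD83 / Conus Albers) as a stand-in for the Albers projection; the projection used in the actual script may differ. EPSG:4326 is the code for WGS84.

```r
library(sf)

# a single made-up point in southern Nevada, stored in WGS84 (EPSG:4326)
pt_wgs84 <- st_sfc(st_point(c(-114.5, 36.4)), crs = 4326)

# assumption: EPSG:5070 stands in for the Albers projection used while shifting
pt_albers <- st_transform(pt_wgs84, 5070)

# convert back to WGS84, the coordinate system Leaflet expects
pt_leaflet <- st_transform(pt_albers, 4326)

# confirm which coordinate system the result is in
st_crs(pt_leaflet)$epsg
```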
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
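If you have not saved a shapefile from R before, the sketch below shows the general shape of the call, assuming `st_write()` from the sf package is what does the saving. The data and the output path are placeholders; I write to a temporary file here so the example does not touch your project folder.

```r
library(sf)

# toy stand-in for the shifted state park data
demo_parks <- st_sf(
  name = c("Park A", "Park B"),
  geometry = st_sfc(st_point(c(-114.5, 36.4)),
                    st_point(c(-115.8, 33.3)),
                    crs = 4326)
)

# hypothetical output location; swap in your own folder and file name
out_path <- tempfile(fileext = ".shp")

# the .shp extension tells sf to write an ESRI Shapefile
st_write(demo_parks, out_path, quiet = TRUE)

file.exists(out_path)
```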
I tried to map the US Map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/66/1ccf2036c7646d2cdc40ba21b1428777bcfb919680fa851f1456eccd32b8e6 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/66/1ccf2036c7646d2cdc40ba21b1428777bcfb919680fa851f1456eccd32b8e6 deleted file mode 100644 index ccc29fc..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/66/1ccf2036c7646d2cdc40ba21b1428777bcfb919680fa851f1456eccd32b8e6 +++ /dev/null @@ -1,403 +0,0 @@ -I"ysWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data without loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
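To see what that output looks like without touching the full PAD-US download, here is a small sketch that builds a throwaway geopackage with two made-up layers and lists them. The layer names and points are invented for the example.

```r
library(sf)

# temporary geopackage so the example is self-contained
gpkg <- tempfile(fileext = ".gpkg")

# two tiny made-up layers
demo_a <- st_sf(id = 1:2,
                geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326))
demo_b <- st_sf(id = 1:3,
                geometry = st_sfc(st_point(c(2, 2)), st_point(c(3, 3)),
                                  st_point(c(4, 4)), crs = 4326))

st_write(demo_a, gpkg, layer = "demo_a", quiet = TRUE)
st_write(demo_b, gpkg, layer = "demo_b", quiet = TRUE)

# list the layers without reading any of the features
layers <- st_layers(gpkg)

layers$name      # layer names
layers$features  # number of features in each layer
```

Only the layer summary is read here, which is why this stays fast even on a very large file.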
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data from the state parks.
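Here is a small sketch of the `layer =` argument on a throwaway geopackage. The layer names `demo_fee` and `demo_easement` and their contents are invented; only the layer that is asked for gets loaded.

```r
library(sf)

# temporary geopackage with two made-up layers
gpkg <- tempfile(fileext = ".gpkg")

fee <- st_sf(Unit_Nm = c("Demo Park 1", "Demo Park 2"),
             geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326))
easement <- st_sf(Unit_Nm = "Demo Easement",
                  geometry = st_sfc(st_point(c(2, 2)), crs = 4326))

st_write(fee, gpkg, layer = "demo_fee", quiet = TRUE)
st_write(easement, gpkg, layer = "demo_easement", quiet = TRUE)

# first argument: path to the .gpkg file; second: the layer to load
fee_only <- st_read(gpkg, layer = "demo_fee", quiet = TRUE)

nrow(fee_only)
fee_only$Unit_Nm
```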
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
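The same logic can be tried on a toy table. In this sketch I use base R's `!(x %in% y)` in place of `%!in%` so the example does not need the operator.tools package; the two are meant to do the same job. The rows are made up.

```r
library(dplyr)

territories <- c("AS", "GU", "MP", "PR", "VI")

# toy rows; only the first one is in a state AND state owned
lands <- tibble(
  Unit_Nm  = c("Demo State Park", "Demo Territory Park", "Demo Federal Unit"),
  State_Nm = c("NV", "PR", "CA"),
  Own_Type = c("STAT", "STAT", "FED")
)

# both conditions must be TRUE for a row to be kept
kept <- lands %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT")

kept$Unit_Nm
```

The territory row fails the first condition and the federal row fails the second, so only one row survives.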
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
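A long chain of `Des_Tp == "..." |` conditions like the one in the code above can also be written with `%in%` and a vector of codes. This is only an alternative sketch, not the code the post uses; the toy rows and the "OTHER" code are made up, and I only list three designation codes to keep it short.

```r
library(dplyr)

# shortened list of designation codes to keep
keep_types <- c("SP", "SREC", "SW")

# toy rows; "OTHER" is an invented code that should be filtered out
lands <- tibble(
  Unit_Nm = c("Demo 1", "Demo 2", "Demo 3", "Demo 4"),
  Des_Tp  = c("SP", "SREC", "OTHER", "SW")
)

# the chained version
with_or <- lands %>%
  filter(Des_Tp == "SP" | Des_Tp == "SREC" | Des_Tp == "SW")

# the same filter written with %in%
with_in <- lands %>%
  filter(Des_Tp %in% keep_types)

identical(with_or$Unit_Nm, with_in$Unit_Nm)
```

The `%in%` form is easier to extend, since adding a designation only means adding one code to `keep_types`.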
- -I am going to restrict my data to include the following designations:
-Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data without loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
- filter(Own_Type == "State")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data from the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
This is a continuation of my previous post where I walked through how to download and modify shape data. I also showed how to shift Alaska and Hawaii so they are closer to the continental usa. -
- -In this post, I’ll go over how to use Leaflet to map the shapefile we made in the previous post. If you’ve come here from part one of the series, you probably have the libraries and data loaded already. However, if you don’t, be sure to load the libraries and shapefiles before moving to number two.
- -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-
## load data
- states <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
to reflect wherever you saved the shifted shapefile.
If your data processing and base map creation are in the same file, you can skip this line, and when you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -At its most basic, all Leaflet needs to create a map is a base map and data layers. The code below may look intimidating, but it’s mostly style options.
- -This is the map we’re going to create. It’s a simple grey map and each state darkens in color as you hover over it. I’ll show the same map after each style option is added so you can see what effect it has.
- - - -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-
## create usa base map using leaflet()
- map <- leaflet() %>%
- addPolygons(data = states,
- smoothFactor = 0.2,
- fillColor = "#808080",
- fillOpacity = 0.5,
- stroke = TRUE,
- weight = 0.5,
- opacity = 0.5,
- color = "#808080",
- highlight = highlightOptions(
- weight = 0.5,
- color = "#000000",
- fillOpacity = 0.7,
- bringToFront = FALSE),
- group = "Base Map")
-
leaflet()
initializes the map widget. I save it to a variable called map (map <-
) so I can run other code in the file without recreating the map each time. When you want to see the map, you can type map
(or whatever you want to name your map) in the terminal and hit enter. R will display the map in the viewer.
addPolygons()
adds a layer to the map widget. Leaflet has different layer options, including addTiles
and addMarkers
which do different things. You can read about them on the leaflet website. Since we’re using a previously created shapefile, we’ll add the shapefile to the map using addPolygons()
.
The first argument you need to specify after calling addPolygons is data = [data-source]
. [data-source]
is whatever variable your data is stored in. For me, it’s called states
. This is either the processed data from part I of this series or the saved shapefile loaded above under the section called load data.
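If you want to try the first two lines without the shapefile, here is a sketch that uses one made-up square in place of the states. The polygon and the variable names are placeholders; with no style options set, Leaflet falls back to its defaults.

```r
library(sf)
library(leaflet)

# one made-up square polygon standing in for the states shapefile
square <- st_polygon(list(rbind(
  c(-100, 40), c(-99, 40), c(-99, 41), c(-100, 41), c(-100, 40)
)))
states_demo <- st_sf(name = "demo", geometry = st_sfc(square, crs = 4326))

# initialize the widget and add the polygons with default styling
map_demo <- leaflet() %>%
  addPolygons(data = states_demo)

class(map_demo)
```

Typing `map_demo` in the console will display it in the viewer.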
When you run only the first two lines, Leaflet will use its default styling. The base color will be a light blue and the outlines of the states will be dark blue and fairly thick.
- - - -You can leave the base map like this if you want, but all additional data will be added as a layer on top of this map, which can become distracting very quickly. I prefer to make my base maps as basic and unobtrusive as possible so the data I add on top of the base map is more prominent.
- -smoothFactor
controls how much the polygon shape should be smoothed at each zoom level. The lower the number the more accurate your shapes will be. A larger number, on the other hand, will lead to better performance, but can distort the shapes of known areas.
I keep the smoothFactor
low because I want the United States to appear as a coherent land mass. The image below shows three different maps, each with a different smoothFactor to illustrate what this argument does. On the left, the map’s smoothFactor=0.2
, the center map’s smoothFactor=10
, and the right’s smoothFactor=100
.
As you can see, the higher the smoothFactor
the less coherent the United States becomes.
addPolygons()
.
-fillColor
refers to what color is on the inside of the polygons. Since I want a minimal base map, I usually set this value to be some shade of grey. If you want a different color, you only need to replace #808080
with the corresponding hex code for the color you want. Here is a useful hex color picker. If you have a hex value and you want the same color in a different shade, this is a useful site.
fillOpacity
determines how transparent the color inside the shape should be. I set mine to be 0.5
because I like the way it looks. The number can be between 0 and 1 with 1 being fully opaque and 0 being fully transparent.
The next four lines define the appearance of the shapes’ outline.
- -The stroke
property can be set to either TRUE
or FALSE
. When true, Leaflet adds an outline around each polygon. When false, the polygons have no outline. In the image below, the map on the left has the default outlines and on the right stroke = FALSE
.
weight = 0.5
sets the thickness of the outlines to be 0.5 pixels. This can be any value you want with higher numbers corresponding to thicker lines. Lower numbers correspond to thinner lines.
The opacity
property operates in the same way as fill opacity above, but on the outlines. The number can be between 0 and 1. Lower numbers correspond to the lines being more transparent and 1 means fully opaque.
color = "#808080"
sets the color of the outline. I typically set it to be the same color as the fill color.
If you want a static base map then lines 2-10 are all you need, as shown in the image below. I like to add some functionality to my base map so that the individual states become darker when they’re hovered over.
- - - -Lines 11-15 define the map’s behavior when the mouse hovers over the shape. Most of the options are the same as the ones used on the base polygon shapes, so I won’t go into them with much detail.
- -highlight = highlightOptions()
contains the mouseover specifications. The word before the equal sign has to be either highlight
or highlightOptions
. The word highlight appears twice because the first is the name of the argument and the second is the function that builds the argument's value.
highlightOptions()
is the actual function call.
weight
, color
, and fillOpacity
all operate in the same way as before, but whatever values you specify here will only show up when the mouse hovers over.
bringToFront
takes one of two values: TRUE
or FALSE
. It only really matters when you have multiple layers (like we will in later parts of this series). When bringToFront = TRUE
hovering over the state will bring it to the front. When bringToFront = FALSE
it will stay in the back.
Since the base map has only one layer, this property doesn’t affect anything.
- -group = "Base Map")
lets you group multiple layers together. This argument will come in handy as we add more information to the map. The base map is the default layer and is always visible - though, when you use map tiles you can define multiple base layers. All other layers will be on top of the base layer. When using different groups, you can define functionality that allows users to turn off certain layers.
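As a rough sketch of where groups lead, the example below puts two made-up markers in two groups and adds a control so one group can be switched off. The coordinates and the "State Parks" group name are placeholders; `addLayersControl()` and `layersControlOptions()` are the Leaflet functions that add the checkbox control.

```r
library(leaflet)

map_demo <- leaflet() %>%
  # a marker in the always-visible base group
  addCircleMarkers(lng = -114.5, lat = 36.4, group = "Base Map") %>%
  # a marker in a second, hypothetical group
  addCircleMarkers(lng = -115.8, lat = 33.3, group = "State Parks") %>%
  # let users toggle the second group on and off
  addLayersControl(
    overlayGroups = c("State Parks"),
    options = layersControlOptions(collapsed = FALSE)
  )

class(map_demo)
```

Only groups listed in `overlayGroups` get a checkbox, so the base map stays visible.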
You’ve created your first base map! It’s a boring flat, grey map, but it’s the base we’ll use when adding in the national and state park data. In part III of this series we’ll process and add in the National Parks.
- - -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/6b/0340d6d0e09e12280d7bacd55cb2b37abfdd0ee344e1490065cc821e544984 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/6b/0340d6d0e09e12280d7bacd55cb2b37abfdd0ee344e1490065cc821e544984 deleted file mode 100644 index 74b8b2b..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/6b/0340d6d0e09e12280d7bacd55cb2b37abfdd0ee344e1490065cc821e544984 +++ /dev/null @@ -1,423 +0,0 @@ -I"é~Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a multi-layer geopackage (instead of a shapefile), I load the data using st_read()
rather than read_sf(), although read_sf() accepts the same layer argument and would work too.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered them in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a multi-layer geopackage (instead of a shapefile), I load the data using st_read()
rather than read_sf(), although read_sf() accepts the same layer argument and would work too.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
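One plausible reason a single combined call misbehaves (an assumption on my part, since the failing attempt isn't shown) is operator precedence: R evaluates & before |, so an unparenthesized chain of | conditions escapes the earlier & conditions. Here's a minimal sketch on an invented five-row data frame. The parks data frame and its values are made up for illustration; only the column names come from PAD-US.

```r
library(dplyr)

# Invented stand-in for the PAD-US Fee layer (not real data)
parks <- data.frame(
  State_Nm = c("NV", "OR", "PR", "CA", "CA"),
  Own_Type = c("STAT", "STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SREC", "SP", "SREC", "MIL")
)

territories  <- c("AS", "GU", "MP", "PR", "VI")
designations <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# One filter() call: comma-separated conditions are combined with AND,
# and %in% replaces the long chain of Des_Tp == ... | ...
one_call <- parks %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% designations)

# Without parentheses this parses as (A & B & C) | D,
# so the federally owned SREC row slips through
no_parens <- parks %>%
  filter(!(State_Nm %in% territories) &
           Own_Type == "STAT" &
           Des_Tp == "SP" | Des_Tp == "SREC")

# Parenthesizing the OR chain restores the intended logic
with_parens <- parks %>%
  filter(!(State_Nm %in% territories) &
           Own_Type == "STAT" &
           (Des_Tp == "SP" | Des_Tp == "SREC"))

one_call$State_Nm    # "NV" "OR"
no_parens$State_Nm   # "NV" "OR" "CA"
with_parens$State_Nm # "NV" "OR"
```

I used !(x %in% y) here only to keep the sketch free of extra packages; %!in% from operator.tools does the same job.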
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for the "Recreation Management Area"
rows; for the rest, I just populated the new column with the old values.
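Here's a minimal sketch of that general form on an invented three-row data frame. The TRUE ~ d_Des_Tp fallback is my assumption for how unmatched rows could keep their old values; without a fallback, case_when() returns NA for any row that matches none of the conditions.

```r
library(dplyr)

# Invented example rows; only the column name comes from PAD-US
designations <- data.frame(
  d_Des_Tp = c("State Park", "Recreation Management Area", "State Wilderness")
)

relabelled <- designations %>%
  mutate(type = case_when(
    d_Des_Tp == "State Park" ~ "State Park or Parkway",
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    TRUE ~ d_Des_Tp  # fallback: keep the original value for everything else
  ))

relabelled$type
```

Conditions are checked top to bottom and the first match wins, which is why the catch-all TRUE goes last.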
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
creates a new column visited
and populated it by using case_when()
.
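One possible way to avoid a case_when() line per park is to keep the visited names in a single vector and test membership with %in%. This is only a sketch under my own assumptions: the visited_parks vector and the toy parks data frame are hypothetical, and "Some Other State Park" is an invented name.

```r
library(dplyr)

# Invented stand-in for the state park data
parks <- data.frame(
  Unit_Nm = c("Valley of Fire State Park", "Salton Sea", "Some Other State Park")
)

# Hypothetical list of visited parks; add a name here instead of a new case_when() line
visited_parks <- c("Valley of Fire State Park",
                   "Crissey Field State Recreation Site",
                   "Salton Sea")

parks <- parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

parks$visited
```

The vector could also be read from a small text or CSV file, so the list of visited parks lives outside the processing code.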
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data for Alaska and Hawaii so it appears under the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same, so I am not going to update part III to reflect that change. If you want to see that code, it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
- -1
-2
-3
-4
-5
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -split()
is part of base R. It takes quite a few arguments, most of which are optional.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks
data into its corresponding states, so it is listed first.
The second argument f =
is how you want the data split. f
in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm))
in the terminal, it will return a list of the 50 state abbreviations. That is what we’re telling R to do here.
You can access an individual state using the $
operator. split_states$CA
will return the state park data for California.
names
is also part of base R. It does what it sounds like - it gets the names of an object. Here, I want to get the names of each of the split data sets.
Here’s the actual for loop.
- -The for(x in y) {
- do something}
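To see the split(), names(), and for pieces working together without the full PAD-US data, here's a minimal base R sketch on an invented three-row data frame. It writes CSV files to a temporary folder rather than shapefiles, which is my substitution so the example runs without sf; the out_dir folder is hypothetical.

```r
# Invented stand-in for state_parks (plain data frame, no geometry)
parks <- data.frame(
  Unit_Nm  = c("Valley of Fire State Park", "Salton Sea", "Anza-Borrego Desert State Park"),
  State_Nm = c("NV", "CA", "CA")
)

# One list element per state, named by the state abbreviation
split_states <- split(parks, f = parks$State_Nm)
all_names <- names(split_states)  # "CA" "NV"

# Hypothetical output folder inside the session's temp directory
out_dir <- file.path(tempdir(), "individual")
dir.create(out_dir, showWarnings = FALSE)

# On each pass, name holds one abbreviation and [[name]] pulls that state's rows
for (name in all_names) {
  write.csv(split_states[[name]],
            file.path(out_dir, paste0(name, ".csv")),
            row.names = FALSE)
}

list.files(out_dir)
```

A for loop in R walks through each element of all_names once, so there is no counter or stopping condition to manage.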
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me an email or a tweet if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered them in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers.
- - - -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/6e/e2e6438058e870a3597eef7f3a107ae94f22eae3aebce97a0d1d3cb71df603 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/6e/e2e6438058e870a3597eef7f3a107ae94f22eae3aebce97a0d1d3cb71df603 deleted file mode 100644 index b75df04..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/6e/e2e6438058e870a3597eef7f3a107ae94f22eae3aebce97a0d1d3cb71df603 +++ /dev/null @@ -1,615 +0,0 @@ -I"×ăWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that don’t require loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
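
To make the layer listing concrete, here is a minimal sketch that runs without the PAD-US download. It assumes a tiny made-up GeoPackage written to a temporary file; the layer name `demo_points` and its columns are hypothetical stand-ins, not part of PAD-US. It also shows one possible way to peek at a few rows with the `query` argument of `st_read()` instead of reading a whole layer.

```r
library(sf)

# Toy stand-in for the PAD-US file: two points written to a temporary GeoPackage.
# The layer name "demo_points" and the "name" column are made up for illustration.
pts <- st_sf(
  name = c("a", "b"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326)
)
gpkg <- tempfile(fileext = ".gpkg")
st_write(pts, gpkg, layer = "demo_points", quiet = TRUE)

# st_layers() reads only the layer metadata, not the features themselves
layers <- st_layers(gpkg)
layers$name      # layer names
layers$geomtype  # geometry type of each layer
layers$features  # number of features in each layer

# Read a single row with a SQL query rather than the whole layer
peek <- st_read(gpkg, query = "SELECT * FROM demo_points LIMIT 1", quiet = TRUE)
```

On the real file, the same pattern would use the path to the `.gpkg` and one of the layer names that `st_layers()` reports.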
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(). Both functions can read a geopackage; read_sf() is a wrapper around st_read() with quieter defaults.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
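
For what it’s worth, here is a hedged sketch of how the territory, ownership, and designation conditions could live in a single filter() call. It runs on a made-up five-row table, not the real PAD-US data. One possible reason a combined call misbehaves is operator precedence: in R, & binds more tightly than |, so `A & B | C` is read as `(A & B) | C`. Using %in% with a vector of codes sidesteps the chain of | entirely.

```r
library(dplyr)

# Toy stand-in for the PAD-US attribute table (rows are made up)
parks <- tibble(
  State_Nm = c("NV", "OR", "PR", "CA", "CA"),
  Own_Type = c("STAT", "STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SREC", "SP", "NP", "OTHER")
)

territories  <- c("AS", "GU", "MP", "PR", "VI")
designations <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Conditions separated by commas are combined with AND
state_only <- parks %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% designations)

nrow(state_only)
```

The base R `!(x %in% y)` form is used here so the sketch does not depend on operator.tools; `%!in%` would work the same way.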
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
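
A quick way to double-check which access types survive the filter is to count them afterwards. This is a minimal sketch on a made-up four-row table; the park names are hypothetical, and the %in% form is just an alternative spelling of the two != conditions.

```r
library(dplyr)

# Toy table with the four access values described above (park names are made up)
access <- tibble(
  Unit_Nm    = c("Park A", "Park B", "Park C", "Park D"),
  d_Pub_Acce = c("Open Access", "Closed", "Restricted Access", "Unknown")
)

# Drop the rows I cannot visit
open_parks <- access %>%
  filter(!(d_Pub_Acce %in% c("Closed", "Unknown")))

# Tally the remaining access types as a sanity check
count(open_parks, d_Pub_Acce)
```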
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for Recreation Management Area
s; for the rest, I just populated the new column with the old values.
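
As a small illustration of that general form, the sketch below recodes one designation and keeps the old value for everything else by using `TRUE ~ d_Des_Tp` as the fallback. The three-row table is made up, and this fallback pattern is an assumption about how one might write it, not necessarily how the full code above handles every category.

```r
library(dplyr)

# Toy table with three designation labels (rows are made up)
toy <- tibble(
  d_Des_Tp = c("State Park", "Recreation Management Area", "State Wilderness")
)

toy <- toy %>%
  mutate(type = case_when(
    # recode the one value that needs a new label
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    # every other row keeps its original value
    TRUE ~ d_Des_Tp
  ))
```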
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the mean time I’ll figure out a better way to keep track of the parks I’ve been to.
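
One possible way to make the visited list easier to maintain is to keep the park names in a single vector and test membership with %in%. This is only a sketch on a made-up table, with "Some Other State Park" as a hypothetical unvisited entry; it is not the approach used in the code above.

```r
library(dplyr)

# Names of parks marked as visited; adding a park means adding one string
visited_parks <- c(
  "Valley of Fire State Park",
  "Crissey Field State Recreation Site",
  "Anza-Borrego Desert State Park"
)

# Toy table standing in for the filtered state park data
toy <- tibble(
  Unit_Nm = c("Valley of Fire State Park", "Some Other State Park")
)

toy <- toy %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))
```

The vector could also be read from a small text or CSV file, so the list lives outside the processing script.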
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears below the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
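
To show the coordinate change in isolation, here is a minimal sketch that transforms a single point from an Albers projection to WGS84. It assumes EPSG:5070 as a stand-in Albers CRS, which may not be the exact one the shifted data uses, and it passes the EPSG code 4326 rather than the proj string from the code above.

```r
library(sf)

# A single point in an Albers equal-area CRS (EPSG:5070 is an assumed stand-in)
pt_albers <- st_sfc(st_point(c(0, 0)), crs = 5070)

# Transform to WGS84 longitude/latitude, which Leaflet expects
pt_wgs84 <- st_transform(pt_albers, 4326)

# Confirm the result uses geographic (longitude/latitude) coordinates
st_is_longlat(pt_wgs84)
```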
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the US Map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/6f/a20d79a1cc8eadc33e524d2b9a64394f609eed0b9e399e496049f9f2119dbb b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/6f/a20d79a1cc8eadc33e524d2b9a64394f609eed0b9e399e496049f9f2119dbb deleted file mode 100644 index dad2a22..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/6f/a20d79a1cc8eadc33e524d2b9a64394f609eed0b9e399e496049f9f2119dbb +++ /dev/null @@ -1,624 +0,0 @@ -I"iéWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that don’t require loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(). Both functions can read a geopackage; read_sf() is a wrapper around st_read() with quieter defaults.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"
s; for the rest, I just populated the new column with the old values.
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data for Alaska and Hawaii so it appears below the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “hello world!”**
- -print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to list the geopackage layers. This doesn’t load the complete data set. Instead, it just returns the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
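If you want to see what st_layers() hands back without touching the huge PAD-US file, here is a minimal sketch that writes a throwaway one-point geopackage to a temporary file and lists its layers. The layer name demo_layer and the point coordinates are made up for illustration; they are not part of PAD-US.

```r
library(sf)

# write a one-point layer to a temporary geopackage (hypothetical demo data)
tmp <- tempfile(fileext = ".gpkg")
demo <- st_sf(name = "demo point",
              geometry = st_sfc(st_point(c(-114.5, 36.4)), crs = 4326))
st_write(demo, tmp, layer = "demo_layer", quiet = TRUE)

# list the layers without reading any of the features
layers <- st_layers(tmp)
layers$name      # layer names
layers$features  # number of features in each layer
```

The same two fields are what I look at for the PAD-US geopackage: the layer name to pass to layer = and the feature count to get a sense of how big each layer is.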
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
- filter(Own_Type == "State")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although read_sf() can also read geopackages because it is a wrapper around st_read().
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is State. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
.
-The unfiltered data set had 247,507 rows. After these the two conditions in this line the data set has
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/72/6df7e46db37ef1176ec5f53dc80e1b83dfe9e99a195ba7216dea4fdfe7d454 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/72/6df7e46db37ef1176ec5f53dc80e1b83dfe9e99a195ba7216dea4fdfe7d454 deleted file mode 100644 index 708bd50..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/72/6df7e46db37ef1176ec5f53dc80e1b83dfe9e99a195ba7216dea4fdfe7d454 +++ /dev/null @@ -1,686 +0,0 @@ -I"ĽţWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to list the geopackage layers. This doesn’t load the complete data set. Instead, it just returns the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access" &
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although read_sf() can also read geopackages because it is a wrapper around st_read().
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
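One likely reason a combined call misbehaves is operator precedence: in R, & binds more tightly than |, so a chain of | conditions tacked onto an & condition needs its own parentheses, or can be collapsed with %in%. The sketch below uses an invented four-row table rather than the PAD-US data, and it writes the not-in test as !(x %in% y) so it runs without operator.tools. The names parks and keep_types are placeholders of mine.

```r
library(dplyr)

# toy stand-in for the PAD-US attribute table (values invented for illustration)
parks <- tibble(
  State_Nm = c("NV", "PR", "CA", "OR"),
  Own_Type = c("STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SP", "SREC", "MIL")
)
territories <- c("AS", "GU", "MP", "PR", "VI")
keep_types  <- c("SP", "SREC")

# without parentheses: & is evaluated before |, so the last condition stands alone
# and lets the federal SREC row through
wrong <- parks %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT" & Des_Tp == "SP" | Des_Tp == "SREC")

# grouping the designations with %in% keeps everything in one filter() call
right <- parks %>%
  filter(!(State_Nm %in% territories), Own_Type == "STAT", Des_Tp %in% keep_types)
```

Separate filter() calls, as in the code above, sidestep the problem entirely, which is probably why that version worked for me.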
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
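A quick way to answer the “what are the other types?” question is to count the values in the column. Here is a small sketch on an invented five-row table (the unit names are placeholders) that also shows the exclusion and inclusion forms selecting the same rows. Note that filter() drops rows where the condition evaluates to NA, so a missing access value is discarded either way.

```r
library(dplyr)

# invented access values, including one missing entry
access <- tibble(
  Unit_Nm    = c("A", "B", "C", "D", "E"),
  d_Pub_Acce = c("Open Access", "Restricted Access", "Closed", "Unknown", NA)
)

# list every access value and how often it occurs
access %>% count(d_Pub_Acce)

# the exclusion written out, as in the post
by_exclusion <- access %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")

# the same selection written as an inclusion
by_inclusion <- access %>%
  filter(d_Pub_Acce %in% c("Open Access", "Restricted Access"))
```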
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"
s; for the rest, I just populated the new column with the old values.
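For reference, this is a sketch of the general pattern with a fallback, on an invented three-row table. It is not what the block above does, since that one spells out every designation. Without the TRUE line, any row that matches none of the conditions ends up as NA in the new column.

```r
library(dplyr)

# invented designations for illustration
designations <- tibble(
  d_Des_Tp = c("State Park", "Recreation Management Area", "State Wilderness")
)

recoded <- designations %>%
  mutate(type = case_when(
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    TRUE ~ d_Des_Tp   # anything not matched above keeps its original value
  ))
```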
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data for Alaska and Hawaii so it appears below the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -The split()
is part of base R. It takes quite a few arguments, most of which are optional.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks
data into its corresponding states, so it is listed first.
The second argument f =
is how you want the data split. f
in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm)) in the terminal, it will return a list of the 50 state abbreviations. That is what we're telling R to do here.
You can access an individual state using the $
operator. split_states$CA
will return the state park data for California.
names
is also part of base R. It does what it sounds like - it gets the names of an object. Here, I want to get the names of each split data set.
Here’s the actual for loop.
- -The basic logic of a for loop is:
- for(x in y) {
- do something}
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me an email or a tweet if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
-                                     TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(). Either would work, since read_sf() is a wrapper around st_read() with different defaults.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the layer PADUS3_0Fee layer because I know it contains the shape data from the state parks.
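If you want to try the layer workflow without downloading the full PAD-US file, here is a minimal sketch. It assumes the sf package is installed and uses a made-up two-point data set, a temporary geopackage, and a hypothetical layer name (toy_parks) in place of the real PADUS3_0Fee layer. The coordinates are rough placeholders, not real park boundaries.

```r
library(sf)

# Toy stand-in for the PAD-US file: two points with placeholder coordinates.
toy <- st_sf(
  Unit_Nm = c("Valley of Fire State Park", "Crissey Field State Recreation Site"),
  geometry = st_sfc(
    st_point(c(-114.5, 36.4)),
    st_point(c(-124.2, 42.0)),
    crs = 4326
  )
)

# Save it as a single layer (hypothetical name) in a temporary geopackage.
gpkg_path <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg_path, layer = "toy_parks", quiet = TRUE)

# List the layer names without loading any features.
layer_names <- st_layers(gpkg_path)$name

# Load only the layer we want.
parks <- st_read(gpkg_path, layer = "toy_parks", quiet = TRUE)
```

With the real data you would swap gpkg_path for the path to the .gpkg file and "toy_parks" for "PADUS3_0Fee".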
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
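One way the separate filter() calls might be combined is with %in%, which checks a column against a whole vector of values. This is only a sketch on a made-up four-row data frame, not the PAD-US data, and keep_types is a name I introduce here for illustration. It uses base R’s %in% with ! in place of %!in% so it runs without operator.tools.

```r
library(dplyr)

# Made-up rows standing in for the PAD-US attribute table.
toy <- data.frame(
  Unit_Nm  = c("A", "B", "C", "D"),
  State_Nm = c("NV", "PR", "CA", "OR"),
  Own_Type = c("STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SP", "SP", "MIL"),
  stringsAsFactors = FALSE
)

territories <- c("AS", "GU", "MP", "PR", "VI")

# Hypothetical helper vector holding the designations to keep.
keep_types <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Conditions separated by commas inside filter() are combined with "and".
filtered <- toy %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% keep_types)
```

Only row A survives: B is in a territory, C is federally owned, and D has a designation that is not in keep_types.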
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for the "Recreation Management Area"
entries; for the rest I just populated the new column with the old values.
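As a small illustration of that general form, here is a sketch on a made-up three-row data frame. It recodes one value and uses a TRUE fallback to copy the old value into the new column for everything else. The data frame and its contents are assumptions for the example only.

```r
library(dplyr)

# Made-up designations standing in for the d_Des_Tp column.
toy <- data.frame(
  d_Des_Tp = c("State Park", "Recreation Management Area", "State Wilderness"),
  stringsAsFactors = FALSE
)

recoded <- toy %>%
  mutate(type = case_when(
    # Recode the one value that needs changing.
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    # Everything else keeps its original value.
    TRUE ~ d_Des_Tp
  ))
```

case_when() checks the conditions in order, so the TRUE line only applies to rows that did not match anything above it.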
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the mean time I’ll figure out a better way to keep track of the parks I’ve been to.
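One possible way to keep track of visited parks, sketched here as an assumption rather than what the project code does, is to store the park names in a vector and check against it with %in%. The visited_parks vector and the toy data frame are made up for the example.

```r
library(dplyr)

# Hypothetical list of visited parks, kept in one place.
visited_parks <- c("Valley of Fire State Park", "Salton Sea")

# Made-up rows standing in for the state park data.
toy <- data.frame(
  Unit_Nm = c("Valley of Fire State Park", "Salton Sea", "Some Other State Park"),
  stringsAsFactors = FALSE
)

# Mark a park as visited when its name appears in the vector.
marked <- toy %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))
```

Adding a park then means adding one name to visited_parks instead of writing another case_when() line.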
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears below the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -The split()
is part of base R. It takes quite a few arguments, most of which are optional.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks
data into its corresponding states, so it is listed first.
The second argument f =
is how you want the data split. f
in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm))
in the terminal, it will return a list of the 50 state abbreviations. That is what we’re telling R to do here.
You can access an individual state using the $
operator. split_states$CA
will return the state park data for California.
names
is also part of base R. It does what it sounds like - it gets the names of an object. Here, I want to get the names of each split data set.
Here’s the actual for loop.
- -The basic logic of a for loop is:
- for(x in y) {
- do something}
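To see split(), names(), and the for loop working together without any spatial data, here is a base R sketch on a made-up three-row data frame. It writes CSV files with write.csv() to a temporary folder instead of shapefiles with st_write(), so it is a stand-in for the real loop, not a copy of it.

```r
# Made-up rows standing in for the state park data.
toy <- data.frame(
  Unit_Nm  = c("A", "B", "C"),
  State_Nm = c("CA", "NV", "CA"),
  stringsAsFactors = FALSE
)

# One data frame per state, stored in a named list.
split_states <- split(toy, f = toy$State_Nm)
all_names <- names(split_states)

# Temporary output folder (an assumption for this example).
out_dir <- tempfile("states_")
dir.create(out_dir)

# Write one file per state, named after the state abbreviation.
for (name in all_names) {
  write.csv(split_states[[name]],
            file.path(out_dir, paste0(name, ".csv")),
            row.names = FALSE)
}

written <- sort(list.files(out_dir))
```

Each pass through the loop sets name to the next state abbreviation, pulls that state’s rows out of the list with [[name]], and saves them to their own file.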
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me an email or a tweet if it is.
-** print("Hello World!")
Here you’ll find several non-partisan ballot guides. I try to include as much information as possible without directly recreating the official voter guide. The information is sourced from the ballot guide, calmatters, ballotpedia, LA Times, voter’s edge, and Mercury News. I have included links to the campaign website or wherever most of the information came from.
+ +I have done my best to keep my views out of it.
+ +I started this to help my two aunts because they would ask me to simplify their ballots for them. Democracy relies on an informed and participatory citizenry, but it’s not always easy. This is meant to alleviate some of the burden.
+ +If you notice any errors, you feel like I’ve missed something, or you found this guide helpful feel free to send me an email [click the envelope at the bottom of the page]
+ + <a href="https://github.com/liz-muehlmann/Election_Guides/raw/main/California/Primary%20Elections/National%20and%20State/2022%20-%20Primary%20-%20California.pdf" download="2022_Primary_CA.pdf">2022 Primary California</a> s
+
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
-                   filter(Own_Type == "STAT")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(). Either would work, since read_sf() is a wrapper around st_read() with different defaults.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the layer PADUS3_0Fee layer because I know it contains the shape data from the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call.
Common filter operators include &
(and), |
(or), <
(less than), or > (greater than).
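Here is a short sketch of those operators on a made-up four-row data frame. The column names mirror the PAD-US ones, but the values and the acreage thresholds are assumptions chosen only to show how each operator behaves.

```r
library(dplyr)

# Made-up rows standing in for the state park data.
toy <- data.frame(
  Unit_Nm   = c("A", "B", "C", "D"),
  State_Nm  = c("CA", "NV", "CA", "OR"),
  GIS_Acres = c(50, 5000, 12000, 300),
  stringsAsFactors = FALSE
)

# & keeps rows where both conditions are true.
big_ca <- toy %>%
  filter(State_Nm == "CA" & GIS_Acres > 1000)

# | keeps rows where at least one condition is true.
ca_or_small <- toy %>%
  filter(State_Nm == "CA" | GIS_Acres < 100)
```

big_ca holds only park C, while ca_or_small holds parks A and C.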
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to list the geopackage’s layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can refer to them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
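To make the shape of that output concrete, here is a minimal sketch of listing layers without reading any features. It assumes the small nc.gpkg example file that ships with the sf package as a stand-in for the PAD-US geopackage; the variable names gpkg_path, layer_names, and feature_count are my own, not part of the original code.

```r
library(sf)

# Assumption: sf's bundled example geopackage stands in for the PAD-US file,
# so the snippet runs without the large download.
gpkg_path <- system.file("gpkg/nc.gpkg", package = "sf")

# st_layers() only reads metadata; no features are loaded.
layers <- st_layers(gpkg_path)

# Printing shows one row per layer: name, geometry type, feature and field
# counts, and the coordinate reference system.
print(layers)

# The same information is available as list elements.
layer_names   <- layers$name
feature_count <- layers$features
```

With the PAD-US file, swapping gpkg_path for the path to PADUS3_0Geopackage.gpkg should list all of its layers the same way.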
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether the land is held in fee or by easement (permission to use someone else’s land). My interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences blocking these areas off. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/78/256997bdedae112ce920a2c66baf2f0d450fbc3dd2c7e20070149f8ac457ef b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/78/256997bdedae112ce920a2c66baf2f0d450fbc3dd2c7e20070149f8ac457ef deleted file mode 100644 index 94b7c29..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/78/256997bdedae112ce920a2c66baf2f0d450fbc3dd2c7e20070149f8ac457ef +++ /dev/null @@ -1,503 +0,0 @@ -I"2¦Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to list the geopackage’s layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can refer to them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether the land is held in fee or by easement (permission to use someone else’s land). My interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error, I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. In the table, I can see that the Own_Type matches that of the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway"))
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although read_sf() can read a geopackage layer as well.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
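As a small illustration of that "both must be true" logic, the sketch below runs the same kind of filter on a made-up four-row table. The parks tibble and its values are hypothetical stand-ins for the PAD-US attribute table, and I use base R's !(x %in% y) so the example does not depend on operator.tools; it should behave like the %!in% operator used in the post.

```r
library(dplyr)

# Hypothetical toy rows standing in for the PAD-US attribute table.
parks <- tibble(
  Unit_Nm  = c("Park A", "Park B", "Park C", "Park D"),
  State_Nm = c("NV", "PR", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "FED", "STAT")
)
territories <- c("AS", "GU", "MP", "PR", "VI")

# Two chained filter() calls ...
chained <- parks %>%
  filter(!(State_Nm %in% territories)) %>%
  filter(Own_Type == "STAT")

# ... keep the same rows as one call that joins the conditions with &.
combined <- parks %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT")

# Park B is dropped (territory) and Park C is dropped (not state owned).
print(combined)
```

Either form works; joining the conditions with & simply keeps the related tests in one place.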
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
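I can’t say for certain why the single-call version failed, but one common cause is operator precedence: R evaluates & before |, so mixing them without parentheses changes which rows survive. The sketch below shows the difference on a hypothetical toy table (the parks tibble and the keep_types vector are my own illustration, not part of the original code), along with an %in% alternative that avoids the issue.

```r
library(dplyr)

# Hypothetical toy rows; Des_Tp codes mimic the PAD-US designation codes.
parks <- tibble(
  Unit_Nm  = c("A", "B", "C", "D"),
  Own_Type = c("STAT", "FED", "STAT", "FED"),
  Des_Tp   = c("SP", "SP", "MIL", "SREC")
)
keep_types <- c("SP", "SREC")

# Without parentheses, & binds tighter than |, so this reads as
# (STAT and SP) or SREC, and the federal SREC row slips through.
unparenthesized <- parks %>%
  filter(Own_Type == "STAT" & Des_Tp == "SP" | Des_Tp == "SREC")

# Grouping the | conditions restores the intended meaning.
grouped <- parks %>%
  filter(Own_Type == "STAT" & (Des_Tp == "SP" | Des_Tp == "SREC"))

# %in% expresses the same "any of these codes" test more compactly.
with_in <- parks %>%
  filter(Own_Type == "STAT", Des_Tp %in% keep_types)
```

Here unparenthesized keeps rows A and D, while grouped and with_in both keep only row A. Keeping the designation filter in its own filter() call, as the post does, sidesteps the precedence question entirely.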
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
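One way to answer the "what are the other types?" question up front is to tally the access values before filtering. This is a minimal sketch on a hypothetical five-row table; the parks tibble is made up for illustration and only the column name d_Pub_Acce and its four values come from the post.

```r
library(dplyr)

# Hypothetical toy rows using the four access values described above.
parks <- tibble(
  Unit_Nm    = c("A", "B", "C", "D", "E"),
  d_Pub_Acce = c("Open Access", "Closed", "Restricted Access",
                 "Unknown", "Open Access")
)

# Tally how many rows fall under each access type.
access_counts <- parks %>% count(d_Pub_Acce)
print(access_counts)

# Drop the two types I can't visit, mirroring the != filter in the post.
visitable <- parks %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")
```

On the real sf object, I would expect dropping the geometry first with st_drop_geometry() to keep the tally fast, since count() on an sf object also has to combine the shapes in each group.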
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to list the geopackage’s layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can refer to them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether the land is held in fee or by easement (permission to use someone else’s land). My interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error, I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. In the table, I can see that the Own_Type matches that of the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although read_sf() can read a geopackage layer as well.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered them in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data that do not rely on loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-
state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
- filter(Own_Type == "STAT")
-
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered them in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data that do not rely on loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
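To make the layer inspection concrete without needing the full PAD-US download, here is a minimal sketch that writes a tiny stand-in geopackage to a temporary file and inspects it. The names toy, toy_path, and toy_fee are made up for illustration, and the query argument of st_read() is not something the original code in this post uses; it is shown only as one possible way to avoid reading a whole layer into memory.

```r
library(sf)

# Toy stand-in for the PAD-US geopackage: three points with an ownership code.
toy <- st_sf(
  Own_Type = c("STAT", "FED", "STAT"),
  Unit_Nm  = c("Park A", "Forest B", "Park C"),
  geometry = st_sfc(
    st_point(c(-115, 36)),
    st_point(c(-120, 40)),
    st_point(c(-116, 33)),
    crs = 4326
  )
)

# Write it to a temporary geopackage with a hypothetical layer name.
toy_path <- tempfile(fileext = ".gpkg")
st_write(toy, toy_path, layer = "toy_fee", quiet = TRUE)

# List the layers without reading any features into memory.
toy_layers <- st_layers(toy_path)
toy_layers$name

# Read only the rows that match a SQL condition instead of the whole layer.
state_only <- st_read(
  toy_path,
  query = "SELECT * FROM toy_fee WHERE Own_Type = 'STAT'",
  quiet = TRUE
)
nrow(state_only)
```

With the real geopackage, the same pattern would use the actual file path and a layer name taken from st_layers(); whether it is fast enough on a file that size is something to test rather than assume.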
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) I load the data using st_read()
instead of read_sf(), although read_sf() can also read a geopackage.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
The Tidyverse’s filter()
function lets you combine conditions within one call by using the or operator |
or the and operator &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
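For what it’s worth, here is a hedged sketch of how the designation filter could be folded into a single filter() call. It runs on a made-up tibble (toy) rather than the PAD-US data, and keep_types is a name I introduce for illustration. One possible reason a combined call misbehaves is operator precedence: in R, & binds more tightly than |, so a long chain of | conditions appended after & needs parentheses. Using %in% sidesteps that.

```r
library(dplyr)

# Made-up rows that mimic the three columns used in the filters.
toy <- tibble(
  State_Nm = c("NV", "PR", "CA", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "FED", "STAT", "STAT"),
  Des_Tp   = c("SP", "SP", "SP", "SREC", "MIL")
)

territories <- c("AS", "GU", "MP", "PR", "VI")
keep_types  <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# One filter() call: conditions separated by commas are combined with AND.
one_call <- toy %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% keep_types)

# The chained style from the post (OR chain shortened), for comparison.
chained <- toy %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT") %>%
  filter(Des_Tp == "SP" | Des_Tp == "SREC")

nrow(one_call)
```

On this toy data both versions keep the same two rows. I use base R’s %in% with ! here so the sketch does not depend on operator.tools.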
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
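As a small illustration of the two ways to write the access filter, the sketch below uses a made-up four-row tibble (toy_access) with one row per access type. It is only meant to show that excluding Closed and Unknown gives the same rows as keeping Open Access and Restricted Access when those four values are the only ones present.

```r
library(dplyr)

# One made-up park per access type listed in the post.
toy_access <- tibble(
  Unit_Nm    = c("A", "B", "C", "D"),
  d_Pub_Acce = c("Open Access", "Closed", "Restricted Access", "Unknown")
)

# Exclusion style, as used in the post.
by_exclusion <- toy_access %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")

# Inclusion style: name the values to keep.
by_inclusion <- toy_access %>%
  filter(d_Pub_Acce %in% c("Open Access", "Restricted Access"))

identical(by_exclusion, by_inclusion)
```

If the real column ever gained a fifth value, the two styles would diverge: the exclusion style would keep it and the inclusion style would drop it.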
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for the "Recreation Management Area"
rows; for the rest, I just populated the new column with the old values.
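Here is a minimal sketch of that general form on a made-up three-row tibble (toy_types). It recodes one value and uses a TRUE fallback to copy the old value for everything else. The fallback is my assumption about one way to "populate the new column with the old values"; it is not taken from the code block above, which spells out each designation.

```r
library(dplyr)

# Made-up designations standing in for the d_Des_Tp column.
toy_types <- tibble(
  d_Des_Tp = c("State Park", "Recreation Management Area", "State Wilderness")
)

recoded <- toy_types %>%
  mutate(type = case_when(
    # Recode one designation to a new label.
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    # Everything else keeps its original value.
    TRUE ~ d_Des_Tp
  ))

recoded$type
```

Without the TRUE line, any row that matches none of the conditions would get NA in the new column.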
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
The logic is the same as for the National Park data. mutate()
creates a new column, visited,
and populates it by using case_when()
.
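One possible way to keep the visited list manageable is to store the park names in a vector and test membership with %in%, instead of writing one case_when() line per park. This is only a sketch on made-up data; visited_parks and toy_parks are names I introduce here, and "Some Other State Park" is a placeholder.

```r
library(dplyr)

# Names of parks marked as visited (two taken from the code above).
visited_parks <- c(
  "Valley of Fire State Park",
  "Anza-Borrego Desert State Park"
)

# Made-up stand-in for the Unit_Nm column.
toy_parks <- tibble(
  Unit_Nm = c(
    "Valley of Fire State Park",
    "Some Other State Park",
    "Anza-Borrego Desert State Park"
  )
)

marked <- toy_parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

marked$visited
```

Adding a park then means adding one string to the vector, which could also be read from a small text or CSV file.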
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
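To check that a transformation like the one in line 40 did what Leaflet needs, a quick test is whether the result is in longitude/latitude. The sketch below uses a single made-up point in EPSG:5070 (a continental US Albers projection that I am using as a stand-in; the projection produced by shift_geometry() may be a different Albers definition) and transforms it to EPSG:4326, which is WGS84 longitude/latitude.

```r
library(sf)

# A made-up point in projected Albers coordinates (metres), EPSG:5070.
pt_albers <- st_sfc(st_point(c(-1500000, 1500000)), crs = 5070)

# Transform to WGS84 longitude/latitude.
pt_wgs84 <- st_transform(pt_albers, 4326)

st_is_longlat(pt_albers)  # FALSE: projected coordinates
st_is_longlat(pt_wgs84)   # TRUE: longitude/latitude
```

Running st_is_longlat(state_parks) after the pipeline would be the equivalent check on the real data.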
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered them in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data that do not rely on loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read() rather than read_sf(), although read_sf() is a wrapper around st_read() and accepts the same layer argument.
Here st_read() takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35, make sure you choose the one with the .gpkg extension.
The second argument, layer =, specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter() function allow you to combine conditions within one filter call by using or (|) or and (&).
The logic in this line says: filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After the two conditions in this line are applied, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter() call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter() call, I chose to use != solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself, and if I saw d_Pub_Acce == "Open Access" my first thought would be: “What are the other types?” and then I’d try to find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate() is part of dplyr, one of the tidyverse packages, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value"). I only needed to change the values for the Recreation Management Areas; for the rest, I just populated the new column with the old values.
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
- -1
-2
-3
-4
-5
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -split() is part of base R. It takes quite a few arguments, most of which are optional.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks data into its corresponding states, so it is listed first.
The second argument, f =, is how you want the data split. f in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm)) in the console, it will return a list of the 50 state abbreviations. Those are the groups we’re telling R to split the data into.
You can access an individual state using the $ operator. split_states$CA will return the state park data for California.
names() is also part of base R. It does what it sounds like - it gets the names of an object. Here, I want to get the names of each of the split data sets.
Here’s the actual for loop.
- -The basic logic of a for loop is:
- for(x in y){
- do something}
Inside the parentheses, x takes on each element of y in turn, and the content in the curly braces runs once for every element.
- -In line 4, for(name in all_names){ says for every name in the list of all names, do whatever is inside the curly braces. name can be whatever you want. It’s a placeholder value. I can have it say for(dogs in all_names){ and it will still do the exact same thing, as long as I also use dogs inside the curly braces. A lot of the time you’ll see it as an i for item. I like to use more descriptive language because, again, for loops are my Achilles’ heel.
The all_names part is wherever you want R to look for the data. It will change based on your data set and variable naming conventions.
In line 5, I save the split data sets.
- -st_write() is part of the sf package, which allows us to create shapefiles. This can be any saving function (e.g., write_csv() if you want to save CSVs). The function takes several arguments. In line 43 above I showed the basic structure: st_write(data, path/to/file.shp). This is good if you only have one file, but since I’m saving them in a loop I don’t want all of the files to have the same name. R will error out after the first and tell you the file already exists.
The first part, split_states[[name]], is still telling R what data to save, but using an index instead of a specific data frame name. To access an element of a list you use data[[some-value]], where some-value is either its position or its name. In my code, name holds a state abbreviation, so on the first pass through the loop R looks up split_states[["AK"]] and returns the data stored under that name. It then does the same for every other name as it loops through all_names.
paste0()
is also part of base R.
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me an email or a tweet if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers so I can load them whenever I want. You don’t have to; the st_layers() function will return the list regardless of whether it’s saved.
st_layers() returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
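To get a feel for what st_layers() returns without the very large PAD-US download, here is a minimal sketch that uses the small nc.gpkg sample file that ships with sf. The sample file and the variable names are my stand-ins, not part of the original workflow; swap in the path to PADUS3_0Geopackage.gpkg for the real thing.

```r
library(sf)

# Assumption: the small sample geopackage bundled with sf stands in for PAD-US
sample_gpkg <- system.file("gpkg/nc.gpkg", package = "sf")

# Read only the layer metadata; no features are loaded at this point
layers <- st_layers(sample_gpkg)

# The layer names are what you would pass to the layer = argument
layer_names <- layers$name
print(layer_names)

# Load a single layer by name
first_layer <- read_sf(sample_gpkg, layer = layer_names[1])
```

Because only the metadata is read in the st_layers() step, this stays fast even when the geopackage itself is too big to open comfortably.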
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether the land is held in fee or as an easement (permission to use someone else’s land). My interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read() rather than read_sf(), although read_sf() is a wrapper around st_read() and accepts the same layer argument.
Here st_read() takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35, make sure you choose the one with the .gpkg extension.
The second argument, layer =, specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter() function allow you to combine conditions within one filter call by using or (|) or and (&).
The logic in this line says: filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After the two conditions in this line are applied, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
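For what it’s worth, here is a minimal sketch of how the state, ownership, and designation conditions could live in a single filter() call. It runs on a made-up toy data frame rather than the PAD-US layer, and it swaps %!in% for the base R !(x %in% y) form so it only needs dplyr. My guess (and it is only a guess) is that mixing & and | in one call trips over operator precedence, since & binds more tightly than | in R; using %in% with a vector of designations sidesteps that.

```r
library(dplyr)

# Toy stand-in for the PAD-US Fee layer; these rows are invented for illustration
toy <- data.frame(
  State_Nm = c("NV", "OR", "PR", "CA", "CA"),
  Own_Type = c("STAT", "STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SREC", "SP", "NP", "SW"),
  stringsAsFactors = FALSE
)

territories  <- c("AS", "GU", "MP", "PR", "VI")
designations <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Conditions separated by commas inside filter() are combined with AND
toy_filtered <- toy %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% designations)

print(toy_filtered)
```

The Puerto Rico row is dropped by the first condition and the federally owned row by the second, so three rows remain.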
-Yet another filter() call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter() call, I chose to use != solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself, and if I saw d_Pub_Acce == "Open Access" my first thought would be: “What are the other types?” and then I’d try to find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate() is part of dplyr, one of the tidyverse packages, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value"). I only needed to change the values for the Recreation Management Areas; for the rest, I just populated the new column with the old values.
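The general form above can be sketched on a toy data frame. The rows here are invented, and the TRUE ~ d_Des_Tp fallback is my addition for illustration: it copies the old value into the new column whenever none of the earlier conditions match, instead of leaving an NA.

```r
library(dplyr)

# Toy designations; "Something Else" is a made-up value that matches no rule
toy <- data.frame(
  d_Des_Tp = c("State Park",
               "Recreation Management Area",
               "State Wilderness",
               "Something Else"),
  stringsAsFactors = FALSE
)

toy_typed <- toy %>%
  mutate(type = case_when(
    d_Des_Tp == "State Park" ~ "State Park or Parkway",
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    TRUE ~ d_Des_Tp  # fallback: keep the original value
  ))

print(toy_typed)
```

case_when() checks the conditions from top to bottom and uses the first one that is true, which is why the catch-all TRUE line goes last.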
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
- -1
-2
-3
-4
-5
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -split() is part of base R. It takes quite a few arguments, most of which are optional.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks data into its corresponding states, so it is listed first.
The second argument, f =, is how you want the data split. f in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm)) in the console, it will return a list of the 50 state abbreviations. Those are the groups we’re telling R to split the data into.
You can access an individual state using the $ operator. split_states$CA will return the state park data for California.
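If it helps to see split() on something smaller than the full state park data, here is a minimal sketch with a three-row toy data frame. The data frame and its name are made up for illustration; only the column names mirror the real data.

```r
# Toy stand-in for state_parks (no geometry, just two columns)
toy_parks <- data.frame(
  Unit_Nm  = c("Valley of Fire State Park",
               "Salton Sea",
               "Anza-Borrego Desert State Park"),
  State_Nm = c("NV", "CA", "CA"),
  stringsAsFactors = FALSE
)

# One list element per state abbreviation
split_toy <- split(toy_parks, f = toy_parks$State_Nm)

# The list is named after the groups, in alphabetical order
toy_names <- names(split_toy)
print(toy_names)

# Both ways of pulling out California return the same two rows
print(split_toy$CA)
print(split_toy[["CA"]])
```

The [[ ]] form is the one that works inside a loop, because the name can come from a variable instead of being typed out.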
names() is also part of base R. It does what it sounds like - it gets the names of an object. Here, I want to get the names of each of the split data sets.
Here’s the actual for loop.
- -The basic logic of a for loop is:
- for(x in y){
- do something}
Inside the parenthesis is the condition that must evaluate to TRUE if the content in the curly braces is to run.
- -In line 4, for(name in all_names){
says as long as there’s a name in the list of all names, do whatever is inside the curly braces. name
can be whatever you want. It’s a placeholder value. I can have it say for(dogs in all_names){
it will still do the exact same thing. A lot of time you’ll see it as an i
for item. I like to use more descriptive language because, again, for loops are my Achilles’ heel.
The all_names
part is wherever you want R to look for the data. It will change based on your data set and variable naming conventions.
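A small sketch of the loop header on its own may help. The three abbreviations and the seen vector are made up for the example. The point is only that the body runs once per element, with name holding a different value on each pass.

```r
# Hypothetical list of names to loop over.
all_names <- c("AK", "AL", "AZ")

# Collect what the loop saw, so the order of the passes is visible afterwards.
seen <- character(0)

for (name in all_names) {
  # On pass 1 name is "AK", on pass 2 "AL", on pass 3 "AZ".
  message("working on ", name)
  seen <- c(seen, name)
}

print(seen)
```

Renaming name to dogs in both the header and the body would give the same result.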
In line 5, I save the split data sets.
- -st_write()
is part of the sf package, which allows us to create shapefiles. This can be any saving function (e.g., write_csv() if you want to save CSVs). The function takes several arguments. In line 43 above I showed the basic structure: st_write(data, "path/to/file.shp"). This is good if you only have one file, but since I’m saving them in a loop I don’t want all of the files to have the same name. If they did, R would error out after the first file and tell you the file already exists.
The first part split_states[[name]]
is still telling R what data to save, but using an index instead of a specific data frame name. To access an index you use data[[some-value]]
where some-value
is the position or name of the element you want. In my code, R will take the split_states
data and go: alright, on the first pass [[name]]
is "AK", so return the data frame stored under that name. It will then do that for every name as it loops through the split_states
data.
paste0()
is also part of base R - it’s a slightly more efficient shorthand for paste()
with sep = "". It concatenates (or links together) different pieces into one. I’m using it to create the filename. Within the paste0
call anything within quotation marks is static. So every file will be saved to "shapefiles/shifted/states/individual/"
and every file will have the extension .shp
. What will change with each loop is the name
of the file. One by one, R will loop through and save each file using the name
it pulled from all_names
.
st_write()
automatically creates the other three files that each “shapefile” needs. When the loop is done, you should have a folder of 200 files (50 states * 4 files each). Which is why I strongly recommend using DVC if you’re doing any kind of version control.
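Here is a short sketch of how the file names get built, using only base R so it runs without any shapefiles. The out_dir and all_names values are stand-ins for the example. The commented-out st_write() line is not run here. It only shows where the delete_dsn argument would go if you want a re-run of the loop to overwrite files instead of stopping because they already exist.

```r
# Hypothetical output folder and state names.
out_dir   <- "shapefiles/shifted/states/individual/"
all_names <- c("AK", "AL", "AZ")

# paste0() glues the pieces together with nothing in between.
# It is vectorised, so this builds one path per name in a single call.
paths <- paste0(out_dir, all_names, ".shp")
print(paths)

# Inside the loop the same call produces one path per pass:
for (name in all_names) {
  path <- paste0(out_dir, name, ".shp")
  message("would write ", path)
  # Not run: st_write(split_states[[name]], path, delete_dsn = TRUE)
}
```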
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me an email or a tweet if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not require loading the full data set into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can refer back to them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- -The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation. Having visited the park, I can tell you there are no fences blocking these areas off. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area. It will be a good test case to make sure I’m selecting the correct data.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not require loading the full data set into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can refer back to them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
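- -This time we’re working with a geopackage (instead of a shapefile), and I load the data using st_read()
instead of read_sf(), although either would work: read_sf() is a wrapper around st_read() and can also read geopackages.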
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
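The two conditions can be tried out on a tiny made-up table first. This sketch uses base R so it runs without any packages: !(x %in% y) is the base R spelling of the not-in test that %!in% from operator.tools provides. The four rows are invented for the example.

```r
# The territories list from the code above.
territories <- c("AS", "GU", "MP", "PR", "VI")

# Hypothetical rows: one state park, two territory rows, one federal row.
parks <- data.frame(
  State_Nm = c("NV", "PR", "CA", "GU"),
  Own_Type = c("STAT", "STAT", "FED", "STAT")
)

# Both conditions must be TRUE for a row to be kept.
keep <- !(parks$State_Nm %in% territories) & parks$Own_Type == "STAT"
kept <- parks[keep, ]
print(kept)
```

Only the Nevada row survives: the PR and GU rows fail the first test and the CA row fails the second.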
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
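One possible reason the combined filter() did not work, and this is a guess on my part rather than something confirmed above, is operator precedence: R evaluates & before |, so a chain of | tests needs parentheses when it sits next to an &. The sketch below shows the difference on invented rows (the MIL code is only a placeholder for a designation I do not want) and a shorter %in% spelling of the same test.

```r
# Designation codes kept in lines 7-15 of the code above.
wanted <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Hypothetical rows: a state park, a federally owned "SP" row, a state row
# with an unwanted designation.
parks <- data.frame(
  Own_Type = c("STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SP", "MIL")
)

# No parentheses: read as (Own_Type == "STAT" & Des_Tp == "ACC") | Des_Tp == "SP"
loose <- parks$Own_Type == "STAT" & parks$Des_Tp == "ACC" | parks$Des_Tp == "SP"

# %in% tests all wanted codes at once and avoids the precedence problem.
strict <- parks$Own_Type == "STAT" & parks$Des_Tp %in% wanted

print(loose)
print(strict)
```

In the loose version the federal row slips through because its Des_Tp matches the last | test on its own.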
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
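One thing worth checking with != filters: a comparison against a missing value returns NA, and filter() drops rows where the condition is NA. Whether Loc_Ds actually contains missing values in PAD-US is an assumption here, so treat this as a sketch of what to look out for rather than a fix the code needs. The loc vector is made up.

```r
# Hypothetical Loc_Ds values, including a missing one.
loc <- c("Hunter Access", NA, "State Park")

# Comparing NA with anything gives NA, not TRUE or FALSE.
plain <- loc != "Hunter Access"
print(plain)

# Keeping the missing values has to be asked for explicitly.
keep <- is.na(loc) | loc != "Hunter Access"
print(keep)
```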
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of dplyr, which is loaded with the tidyverse, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"
s; for the rest I just populated the new column with the old values.
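As a reminder of the general form, here is a minimal case_when() sketch on a made-up tibble. It assumes dplyr is installed. The final TRUE ~ d_Des_Tp line is a catch-all that copies the old value for every row no earlier condition matched. Without a catch-all, unmatched rows become NA.

```r
library(dplyr)

# Hypothetical designations, not the full PAD-US list.
parks <- tibble(
  d_Des_Tp = c("State Park", "Recreation Management Area", "State Wilderness")
)

parks <- parks %>%
  mutate(type = case_when(
    # Rename one designation...
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    # ...and keep the original value for everything else.
    TRUE ~ d_Des_Tp
  ))

print(parks)
```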
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
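One possible way to keep track of visited parks, offered as a sketch and not what the code above does, is to keep the names in a single vector and test membership with %in%. The visited_parks vector and the two-row parks table are made up for the example. Inside a mutate() call the same ifelse() expression should work on the Unit_Nm column.

```r
# Hypothetical list of parks I have been to; add a name here to mark it visited.
visited_parks <- c(
  "Valley of Fire State Park",
  "Crissey Field State Recreation Site",
  "Salton Sea"
)

# Toy stand-in for the state park data.
parks <- data.frame(
  Unit_Nm = c("Valley of Fire State Park", "Some Other State Park")
)

# One membership test replaces one case_when() line per park.
parks$visited <- ifelse(parks$Unit_Nm %in% visited_parks, "visited", "not visited")
print(parks)
```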
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code, it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
- -1
-2
-3
-4
-5
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -The split()
is part of base R. It takes quite a few arguments, most of which are optional.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks
data into its corresponding states, so it is listed first.
The second argument f =
is how you want the data split. f
in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm))
in the terminal, it will return a list of the 50 state abbreviations. That is what we’re telling R to do here.
You can access an individual state using the $
operator. split_states$CA
will return the state park data for California.
names
is also part of base R. It does what it sounds like - it gets the names of an object. Here, I want to get the name of each split data set.
Here’s the actual for loop.
- -The basic logic of a for loop is:
- for(x in y){
- do something}
Inside the parentheses, x takes each value in y in turn, and the content in the curly braces runs once for each value.
- -In line 4, for(name in all_names){
says as long as there’s a name in the list of all names, do whatever is inside the curly braces. name
can be whatever you want. It’s a placeholder value. I can have it say for(dogs in all_names){
and it will still do the exact same thing, as long as the code inside the braces uses dogs too. A lot of the time you’ll see it as an i
for item. I like to use more descriptive language because, again, for loops are my Achilles’ heel.
The all_names
part is wherever you want R to look for the data. It will change based on your data set and variable naming conventions.
In line 5, I save the split data sets.
- -st_write()
is part of the sf package, which allows us to create shapefiles. This can be any saving function (e.g., write_csv() if you want to save CSVs). The function takes several arguments. In line 43 above I showed the basic structure: st_write(data, "path/to/file.shp"). This is good if you only have one file, but since I’m saving them in a loop I don’t want all of the files to have the same name. If they did, R would error out after the first file and tell you the file already exists.
The first part split_states[[name]]
is still telling R what data to save, but using an index instead of a specific data frame name. To access an index you use data[[some-value]]
where some-value
is the position or name of the element you want. In my code, R will take the split_states
data and go: alright, on the first pass [[name]]
is "AK", so return the data frame stored under that name. It will then do that for every name as it loops through the split_states
data.
paste0()
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me an email or a tweet if it is.
-** print("Hello World!")
Here you’ll find several non-partisan ballot guides. I try to include as much information as possible without directly recreating the official voter guide. The information is sourced from the ballot guide, CalMatters, Ballotpedia, the LA Times, Voter’s Edge, and the Mercury News. I have included links to the campaign website or wherever most of the information came from.
+ +I have done my best to keep my views out of it.
+ +I started this to help my two aunts because they would ask me to simplify their ballots for them. Democracy relies on an informed and participatory citizenry, but it’s not always easy. This is meant to alleviate some of the burden.
+ +If you notice any errors, feel like I’ve missed something, or found this guide helpful, feel free to send me an email (click the envelope at the bottom of the page).
+ +Click on a link to download the PDF.
+:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/8b/3729b9dce6676a5ec02ede434da75a73b848be8c9aaeb49a63b163ca1eba47 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/8b/3729b9dce6676a5ec02ede434da75a73b848be8c9aaeb49a63b163ca1eba47 deleted file mode 100644 index 6ee89f5..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/8b/3729b9dce6676a5ec02ede434da75a73b848be8c9aaeb49a63b163ca1eba47 +++ /dev/null @@ -1,595 +0,0 @@ -I"¦ŕWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
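To see what that output looks like without touching the full PAD-US download, here is a minimal sketch that writes a tiny two-layer geopackage to a temporary file and lists its layers. The layer names `toy_fee` and `toy_easement` and the point data are made up for illustration; only `st_layers()` and its `name` and `features` fields mirror what the post uses.

```r
library(sf)

# Build a tiny two-layer geopackage in a temp file (toy data, not PAD-US).
gpkg <- tempfile(fileext = ".gpkg")
pts <- st_sf(
  name = c("a", "b"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326)
)
st_write(pts, gpkg, layer = "toy_fee", quiet = TRUE)
st_write(pts[1, ], gpkg, layer = "toy_easement", quiet = TRUE)

# st_layers() reads only the layer metadata, not the features themselves.
layers <- st_layers(gpkg)
layers$name      # the layer names
layers$features  # the number of features in each layer
```

Because only metadata is read, the same call should stay quick on a file as large as the national geopackage.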
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer and filter it for the 50 states and for state-owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although either function can read a geopackage layer.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
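Since loading a full layer can overwhelm R, one option worth knowing about is the `query` argument of `st_read()`, which hands a SQL statement to GDAL so that only matching rows are read. The sketch below uses a toy geopackage with a made-up `toy_fee` layer; whether the same kind of query performs well against the real PADUS3_0Fee layer is an assumption I have not tested.

```r
library(sf)

# Toy geopackage standing in for the PAD-US file (hypothetical data).
gpkg <- tempfile(fileext = ".gpkg")
toy <- st_sf(
  Own_Type = c("STAT", "FED", "STAT"),
  Unit_Nm  = c("Park A", "Forest B", "Park C"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(2, 2)),
                    crs = 4326)
)
st_write(toy, gpkg, layer = "toy_fee", quiet = TRUE)

# Read the whole layer by name, as the post does.
all_rows <- st_read(gpkg, layer = "toy_fee", quiet = TRUE)

# Or filter at read time, so only the state-owned rows reach R.
state_only <- st_read(
  gpkg,
  query = "SELECT * FROM toy_fee WHERE Own_Type = 'STAT'",
  quiet = TRUE
)

nrow(all_rows)
nrow(state_only)
```

The trade-off is that the filter lives in a SQL string rather than in a `filter()` call, which is harder to read next to the rest of a tidyverse pipeline.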
The tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
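One likely reason the combined call misbehaves is operator precedence: in R, `&` binds more tightly than `|`, so a chain of `|` tests appended to an `&` condition is not grouped the way it reads. The sketch below shows the effect on a made-up five-row table and one way around it with `%in%`. I use `!(x %in% y)` instead of `%!in%` only to keep the example free of extra packages; this is a guess at the cause, not a diagnosis of the original code.

```r
library(dplyr)

territories <- c("AS", "GU", "MP", "PR", "VI")

# Hypothetical rows, not real PAD-US records.
toy <- tibble(
  State_Nm = c("NV", "PR", "CA", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "FED", "STAT", "STAT"),
  Des_Tp   = c("SP", "SREC", "SREC", "SREC", "MIL")
)

# & is evaluated before |, so the trailing | lets every SREC row through,
# including the territory row and the federal row.
loose <- toy %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT" &
           Des_Tp == "SP" | Des_Tp == "SREC")

# %in% keeps the designation test as a single condition.
strict <- toy %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% c("SP", "SREC"))

nrow(loose)
nrow(strict)
```

Wrapping the `|` chain in parentheses would also work; `%in%` is simply shorter when there are nine designations to list.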
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
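One caveat with `!=` is worth a quick sketch: `filter()` drops rows where the condition evaluates to `NA`, so any record with a missing access value disappears as well. I don't know whether the PAD-US layer actually contains missing values in d_Pub_Acce, so the `NA` row below is purely hypothetical.

```r
library(dplyr)

# Hypothetical rows; the NA is there only to show the difference.
toy <- tibble(
  Unit_Nm    = c("A", "B", "C", "D", "E"),
  d_Pub_Acce = c("Open Access", "Restricted Access", "Closed", "Unknown", NA)
)

# With !=, the missing value compares to NA and filter() drops row E.
with_neq <- toy %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")

# %in% treats NA as "not in the list", so row E is kept.
excluded <- c("Closed", "Unknown")
with_in <- toy %>%
  filter(!(d_Pub_Acce %in% excluded))

with_neq$Unit_Nm
with_in$Unit_Nm
```

Either form keeps the excluded values visible in the code, which is the point the paragraph above makes about readability.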
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -Lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of dplyr, one of the tidyverse packages, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for Recreation Management Area
s; for the rest, I just populated the new column with the old values.
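As a small illustration of that general form, the sketch below relabels one designation and falls back to the existing value for everything else. The three-row table is made up, and the `TRUE ~ d_Des_Tp` fallback is my own shorthand for "keep the old value"; the post's actual code spells out every designation instead.

```r
library(dplyr)

# Hypothetical designations, not real PAD-US rows.
toy <- tibble(
  d_Des_Tp = c("State Park", "Recreation Management Area", "State Wilderness")
)

relabelled <- toy %>%
  mutate(type = case_when(
    d_Des_Tp == "Recreation Management Area" ~
      "State Preserve, Reserve, or Recreation Area",
    TRUE ~ d_Des_Tp  # fallback: copy the original value unchanged
  ))

relabelled$type
```

Conditions are checked in order and the first match wins, so the catch-all `TRUE` line has to come last.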
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
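One possible way to make the visited list easier to maintain is to keep the park names in a single vector and test membership with `%in%`, rather than writing one `case_when()` line per park. This is only a sketch of that idea on made-up rows; `visited_parks` is a hypothetical name, and the list could just as well come from a CSV.

```r
library(dplyr)

# Hypothetical shortlist; extend it as more parks are visited.
visited_parks <- c("Valley of Fire State Park", "Salton Sea")

toy <- tibble(
  Unit_Nm = c("Valley of Fire State Park", "Some Other State Park", "Salton Sea")
)

marked <- toy %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

marked$visited
```

The match is exact, so the names in the vector have to be spelled the same way as in the Unit_Nm column.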
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data for Alaska and Hawaii so it appears below the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
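To check that a transformation like this did what Leaflet needs, `st_is_longlat()` reports whether the coordinates are in degrees. The sketch below uses a single point in EPSG:5070 (a CONUS Albers projection) as a stand-in for the shifted data, and EPSG:4326 as a shorthand for longitude/latitude on WGS84. Both codes are my assumptions; the post itself passes a proj string and does not say which Albers definition the shifted data carries.

```r
library(sf)

# One point at the origin of CONUS Albers (EPSG:5070), a stand-in for the
# shifted park polygons.
albers_pt <- st_sfc(st_point(c(0, 0)), crs = 5070)

# EPSG:4326 is longitude/latitude on the WGS84 datum.
wgs84_pt <- st_transform(albers_pt, 4326)

st_is_longlat(albers_pt)  # FALSE: projected coordinates in metres
st_is_longlat(wgs84_pt)   # TRUE: degrees, which is what Leaflet expects
st_coordinates(wgs84_pt)  # should land near the projection's origin
```

Running the same `st_is_longlat()` check on `state_parks` after line 40 is a cheap way to catch a missing transform before the map renders blank.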
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer and filter it for the 50 states and for state-owned land.
- -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/8d/92e7954d80f33b4f97a12f3270341f738f014ee3cc4d1f9fd44eba9ebb2024 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/8d/92e7954d80f33b4f97a12f3270341f738f014ee3cc4d1f9fd44eba9ebb2024 deleted file mode 100644 index 0856258..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/8d/92e7954d80f33b4f97a12f3270341f738f014ee3cc4d1f9fd44eba9ebb2024 +++ /dev/null @@ -1,413 +0,0 @@ -I"9}Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although read_sf() also accepts a layer argument and would work here.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
The tidyverse’s filter()
function lets you combine conditions within one call by using | (or) and & (and).
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
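
To see that logic on something small, here is a minimal base R sketch. The data frame is made up for illustration (it is not PAD-US data), and it uses `!(x %in% y)`, which should behave the same way as the `%!in%` operator from operator.tools.

```r
# Toy stand-in for the PAD-US attribute table; every value here is invented
parks <- data.frame(
  Unit_Nm  = c("Park A", "Park B", "Park C", "Park D"),
  State_Nm = c("NV", "PR", "OR", "GU"),
  Own_Type = c("STAT", "STAT", "FED", "STAT"),
  stringsAsFactors = FALSE
)

territories <- c("AS", "GU", "MP", "PR", "VI")

# A row is kept only when BOTH conditions are TRUE:
# the state code is not a territory AND the owner is the state
keep <- !(parks$State_Nm %in% territories) & parks$Own_Type == "STAT"

state_only <- parks[keep, ]
print(state_only)
```

Park B and Park D drop out because they are in territories, and Park C drops out because it is not state owned.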
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction, but still a substantial number of rows.
- -Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to list the geopackage’s layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although read_sf() also accepts a layer argument and would work here.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
The tidyverse’s filter()
function lets you combine conditions within one call by using | (or) and & (and).
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction, but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
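
For what it’s worth, here is one way the two filter() calls could be folded into one, sketched on a made-up tibble rather than the real PAD-US layer. The `keep_types` vector and the toy rows are names and values I introduce for illustration only. Separating conditions with commas inside filter() is the same as joining them with &, and `%in%` stands in for the long chain of | comparisons.

```r
library(dplyr)

# Invented rows that mimic the three columns used in the filters
parks <- tibble(
  Unit_Nm  = c("A", "B", "C", "D", "E"),
  State_Nm = c("NV", "PR", "OR", "CA", "CA"),
  Own_Type = c("STAT", "STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SP", "SREC", "SP", "MIT")
)

territories <- c("AS", "GU", "MP", "PR", "VI")

# Hypothetical helper vector holding the designation codes to keep
keep_types <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

state_parks_toy <- parks %>%
  filter(!(State_Nm %in% territories),  # drop the territories
         Own_Type == "STAT",            # state owned land only
         Des_Tp %in% keep_types)        # any one of the wanted designations

print(state_parks_toy)
```

Only rows A and C survive: B is in a territory, D is not state owned, and E has a designation code that is not in the list.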
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- ## load and process state park data
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
- filter(d_Own_Type == "State" & (d_Des_Tp == "Recreation Management Area" |
- d_Des_Tp == "State Historic or Cultural Area" |
- d_Des_Tp == "State Park" |
- d_Des_Tp == "State Wilderness")) %>%
- select(d_Des_Tp, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Historic or Cultural Area" ~ "Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Park" ~ "State Park",
- d_Des_Tp == "State Wilderness" ~ "State Wilderness")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE, # resizes alaska to fit with the size of the other states
- position = "below") %>% # moves alaska so it's near hawaii
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs") # changes the geographic data from NAD83 to WGS84
-
Line 2 here is the same as it was when I processed the National Park data in part III. It creates a list of the postal codes associated with the United States’ associated territories and lands. In line 6 we use it to filter these areas out of the data set because I am only interested in the 50 states.
- -Line 5 is where we actually load the data into R. Before, when I used st_layers()
I was only getting the layer information from the geopackage. Here, I use the st_read()
function which takes two arguments.
The first argument is the path to the geopackage. Geopackage files contain a lot of information, so make sure you choose the file with the .gpkg
extension.
The second argument layer
tells R which layer from the geopackage to open. Here, I load the Fee layer because it has all the data I’m interested in.
The %>%
is part of the tidyverse and it tells R to continue processing the data on the next line.
I went over the logic of this line when I did the National Park data. You can read about it more in detail in Part III. All this line does is filter out the US’ associated islands and territories.
- -The filter
function in the tidyverse is very powerful. It can do complex operations by using certain operators. Here, I’m using it to filter for ownership type. The PAD-US data includes national parks, Native American land, and military installations. I didn’t want any of these areas on my map so I filtered for "State"
ownership.
However, I didn’t want just state ownership because that also includes a lot of land areas that aren’t necessarily state parks - like watersheds or mineral areas. I use the &
operator to tell R I want land owned by the state and of a certain designation type.
The d_Des_Tp
column contains the land’s designation code. I am selecting for state ownership and a designation type of “Recreation Management Area” or |
“State Historic or Cultural Area” or “State Park” or “State Wilderness” in lines 7-10.
This line will give me state owned land that is probably a state park.
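
One caveat when mixing & and | in a single condition: R evaluates & before |, so without parentheses the ownership test only applies to the first designation. A small base R sketch with made-up values shows the difference.

```r
# Invented values: the second row is NOT state owned
own <- c("State", "Federal", "State")
des <- c("Recreation Management Area", "State Park", "State Park")

# No parentheses: read by R as (own & first designation) | second designation
without_parens <- own == "State" & des == "Recreation Management Area" |
  des == "State Park"

# Parentheses: state owned AND (either designation)
with_parens <- own == "State" &
  (des == "Recreation Management Area" | des == "State Park")

print(without_parens)
print(with_parens)
```

In the first version the Federal row slips through because it matches "State Park" on its own; in the second it is excluded.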
- -select()
lets me choose the columns I want to keep by name, rather than by index number. I want to reduce PAD-US data size and get rid of any information I don’t need for my map. You’ll need to inspect the data and look at the documentation if you want to use different columns.
I wanted the land designation type (state park, wilderness, etc.) in the d_Des_Tp
column, the park name (Unit_Nm
), state abbreviation (State_Nm
), long form of the state name (d_State_Nm
), and size of the park (GIS_Acres
).
Because this is an sf object, select() also keeps the geometry, which is stored in the SHAPE
column. To drop the geometry, use st_drop_geometry().
mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area" and "State Historic or Cultural Area"; for the rest, I populated the new column with the old values.
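
As a small illustration of that general form, here is a sketch on an invented tibble. It differs from my code above in one way: instead of listing every designation, it uses a `TRUE ~ d_Des_Tp` fallback to copy the old value across. Without a fallback, any row that matches no condition gets NA.

```r
library(dplyr)

# Made-up designations; "Something Else" is there to show the fallback
designations <- tibble(
  d_Des_Tp = c("State Park",
               "Recreation Management Area",
               "State Wilderness",
               "Something Else")
)

recoded <- designations %>%
  mutate(type = case_when(
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    TRUE ~ d_Des_Tp  # every other row keeps its original value
  ))

print(recoded)
```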
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks.
For the state parks, though, it is unwieldy. Even in its reduced size the PAD-US data has over 5,000 parks. It would be cumbersome to sort through all 5,000 parks and individually code “visited” or “not visited.”
- -I had originally wanted to drop the geometry and download the parks as a CSV, but even that was a lot.
- -Instead, I am going to focus on the parks I know I’ve visited and have photos of. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
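
One possible way to keep the bookkeeping manageable, sketched here on invented rows, is to store the visited park names in a single vector and test membership with `%in%`. The `visited_parks` name is hypothetical; this is not what the code in this post does.

```r
library(dplyr)

# Hypothetical list of parks I have been to; add names here as needed
visited_parks <- c("Valley of Fire State Park",
                   "Crissey Field State Recreation Site")

# Toy stand-in for the Unit_Nm column
parks <- tibble(
  Unit_Nm = c("Valley of Fire State Park",
              "Some Other Park",
              "Crissey Field State Recreation Site")
)

parks <- parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks,
                           "visited",
                           "not visited"))

print(parks)
```

Adding a park then means adding one string to the vector instead of another case_when() line.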
- -In line 16, I create the visited
column using the tidyverse's
mutate()
function. I populate the column using case_when()
which looks in the Unit_Nm
column for the park named Valley of Fire State Park
. Once found, it puts *visited* in the visited
column.
Line 17 operates in the same way as an else
statement. It enters not visited into the visited
column for every park not specifically listed using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 18-19 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 20 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 21 saves the shifted shapefile to the hard drive.
- -Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to list the geopackage’s layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access" &
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although read_sf() also accepts a layer argument and would work here.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
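Here is a minimal sketch of how the two filter() calls might be merged. The toy data frame and the `keep_types` vector are my own illustration, not PAD-US data. In R, `&` binds more tightly than `|`, so chaining `&` conditions directly onto a run of `|` comparisons groups them differently than intended; testing membership with `%in%` avoids the problem.

```r
library(dplyr)

# Toy stand-in for the PAD-US attribute table (values are invented)
parks <- data.frame(
  State_Nm = c("NV", "CA", "PR", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SREC", "SP", "SP", "MIL"),
  stringsAsFactors = FALSE
)

territories <- c("AS", "GU", "MP", "PR", "VI")
# Hypothetical helper vector holding the designations to keep
keep_types  <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# One filter() call: comma-separated conditions are combined with AND
combined <- parks %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% keep_types)

print(combined)
```

Only the Nevada and first California rows survive, because every condition has to be true for a row to be kept.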
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
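As a small sketch of the trade-off, using an invented vector rather than the real d_Pub_Acce column: unique() answers the "what are the other types?" question directly, and with only four categories, excluding two values and including the other two give the same result.

```r
# Invented access values mirroring the four categories described above
access <- c("Closed", "Unknown", "Open Access", "Restricted Access", "Open Access")

# List the distinct categories before deciding what to drop
print(unique(access))

# Exclusion (the approach used in the post) and inclusion keep the same values here
kept_by_exclusion <- access[access != "Closed" & access != "Unknown"]
kept_by_inclusion <- access[access %in% c("Open Access", "Restricted Access")]

print(identical(kept_by_exclusion, kept_by_inclusion))
```

The two approaches only diverge if a fifth category or missing values show up in the column.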
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the dplyr package, which is loaded with the tidyverse, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"
entries; for the rest, I just populated the new column with the old values.
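A minimal sketch of that pattern on an invented data frame: one value is relabelled and a `TRUE ~` fallback copies the original value for every other row. The replacement label here is only an example.

```r
library(dplyr)

# Invented designations; only the first one gets a new label
designations <- data.frame(
  d_Des_Tp = c("Recreation Management Area", "State Park", "State Wilderness"),
  stringsAsFactors = FALSE
)

relabelled <- designations %>%
  mutate(type = case_when(
    d_Des_Tp == "Recreation Management Area" ~ "State Recreation Area",
    TRUE ~ d_Des_Tp  # fallback: keep the original value
  ))

print(relabelled)
```

Without the fallback line, any row that matches none of the conditions would get NA in the new column.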
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears below the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code, it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
- -1
-2
-3
-4
-5
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -split()
is part of base R. It takes quite a few arguments, most of which are optional.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks
data into its corresponding states, so it is listed first.
The second argument f =
is how you want the data split. f
in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm)) in the console, it will return a list of the 50 state abbreviations. That is what we're telling R to do here.
You can access an individual state using the $
operator. split_states$CA
will return the state park data for California.
names
is also part of base R. It does what it sounds like - it gets the names of an object. Here, I want to get the name of each of the split data sets.
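To see split(), names(), and the loop working together without touching the real data, here is a small sketch on an invented data frame. It writes CSV files to a temporary folder instead of shapefiles, so it only needs base R.

```r
# Invented stand-in for the state park table
parks <- data.frame(
  Unit_Nm  = c("Valley of Fire State Park", "Salton Sea", "Anza-Borrego Desert State Park"),
  State_Nm = c("NV", "CA", "CA"),
  stringsAsFactors = FALSE
)

# One data frame per state, stored in a named list
split_parks <- split(parks, f = parks$State_Nm)
all_names <- names(split_parks)

# Access a single state with $ or [[ ]]
print(split_parks$CA)

# Write each state's rows to its own file in a temporary folder
out_dir <- tempdir()
for (name in all_names) {
  write.csv(split_parks[[name]],
            file.path(out_dir, paste0(name, ".csv")),
            row.names = FALSE)
}
```

The list names come from the values in State_Nm, which is why they can be reused as file names inside the loop.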
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me an email or a tweet if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
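If you want to try st_layers() without downloading PAD-US first, here is a minimal sketch that builds a tiny two-layer geopackage in a temporary file. The layer names and coordinates are made up for illustration.

```r
library(sf)

# Two tiny point layers (coordinates are arbitrary illustration values)
pts_a <- st_sf(id = 1:2,
               geometry = st_sfc(st_point(c(-115, 36)), st_point(c(-116, 33)), crs = 4326))
pts_b <- st_sf(id = 1L,
               geometry = st_sfc(st_point(c(-124, 42)), crs = 4326))

# Write both layers into one geopackage in a temporary location
gpkg_path <- tempfile(fileext = ".gpkg")
st_write(pts_a, gpkg_path, layer = "layer_a", quiet = TRUE)
st_write(pts_b, gpkg_path, layer = "layer_b", quiet = TRUE)

# List the layers without reading any features into memory
layers <- st_layers(gpkg_path)
print(layers)

# Then read just the one layer you need
one_layer <- st_read(gpkg_path, layer = "layer_b", quiet = TRUE)
```

The same two steps, listing the layers and then reading a single one by name, are what I use on the much larger PAD-US file.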
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- - - -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although read_sf() can read a geopackage as well.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
-    library("operator.tools") # not-in function
-    library("tigris") # shift_geometry() for Alaska and Hawaii
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- - - -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
-                             TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although read_sf() can read a geopackage as well.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the dplyr package, which is loaded with the tidyverse, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"
s; for the rest, I just populated the new column with the old values.
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
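One possible way to keep that bookkeeping manageable, offered only as a sketch and not what the code in this post does: store the visited park names in a single vector and test membership with %in%. The vector name visited_parks and the second row of the toy table are hypothetical.

```r
library(dplyr)

# Hypothetical list of visited parks; extend it as photos become available.
visited_parks <- c("Valley of Fire State Park",
                   "Anza-Borrego Desert State Park")

# Toy stand-in for the state park table.
toy <- tibble(Unit_Nm = c("Valley of Fire State Park", "Example State Park"))

# One membership test replaces one case_when() line per park.
toy <- toy %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

toy$visited
```

Adding a park then means adding one string to the vector instead of another case_when() line.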
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
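To see the reprojection step in isolation, here is a minimal sketch that moves a single point from a projected Albers coordinate system to WGS84. EPSG:5070 and the coordinates are assumptions chosen for illustration; they are not taken from the PAD-US file.

```r
library(sf)

# One arbitrary point in a projected CRS (EPSG:5070, a CONUS Albers projection).
pt_albers <- st_sfc(st_point(c(-1500000, 1800000)), crs = 5070)

# Reproject to WGS84 longitude/latitude (EPSG:4326), which Leaflet expects.
pt_wgs84 <- st_transform(pt_albers, 4326)

st_is_longlat(pt_albers) # FALSE: coordinates are in meters
st_is_longlat(pt_wgs84)  # TRUE: coordinates are now degrees
```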
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered that in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
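To get a feel for what st_layers() returns without downloading PAD-US, here is a minimal sketch that writes a tiny throwaway geopackage to a temporary file and lists its layers. The layer name demo_points and the two sample points are made up for illustration.

```r
library(sf)

# Two made-up points, just enough to create a layer.
pts <- st_sf(
  name = c("a", "b"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326)
)

# Write them to a temporary geopackage so nothing touches the project folders.
gpkg <- tempfile(fileext = ".gpkg")
st_write(pts, gpkg, layer = "demo_points", quiet = TRUE)

# do_count = TRUE asks the driver to count features explicitly.
layers <- st_layers(gpkg, do_count = TRUE)

layers$name     # layer names
layers$geomtype # geometry type of each layer
layers$features # number of features in each layer
```

Only the layer metadata is read here, which is why the same call stays fast on a file as large as the PAD-US geopackage.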
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
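- -Since we’re working with a geopackage that holds several layers (instead of a single shapefile), I load the data using st_read()
rather than read_sf(). Either function can read a geopackage; read_sf() is a quieter wrapper around st_read().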
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
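The layer argument can be tried out on a small file before pointing it at the PAD-US download. This is a minimal sketch with a temporary two-layer geopackage; the layer names parks and ramps and the points in them are invented for the example.

```r
library(sf)

# Two made-up layers with different numbers of features.
parks <- st_sf(
  Unit_Nm = c("Example Park A", "Example Park B"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326)
)
ramps <- st_sf(
  Unit_Nm = "Example Boat Ramp",
  geometry = st_sfc(st_point(c(2, 2)), crs = 4326)
)

# Write both layers into the same temporary geopackage.
gpkg <- tempfile(fileext = ".gpkg")
st_write(parks, gpkg, layer = "parks", quiet = TRUE)
st_write(ramps, gpkg, layer = "ramps", quiet = TRUE)

# Only the requested layer is read into memory.
parks_only <- st_read(gpkg, layer = "parks", quiet = TRUE)

nrow(parks_only)
parks_only$Unit_Nm
```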
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
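The same two-condition filter can be tried on a toy table. This is a minimal sketch: the rows are made up, and %!in% is defined locally with Negate() as a stand-in for the operator.tools version so the example needs one fewer package.

```r
library(dplyr)

# Local stand-in for operator.tools' %!in% (an assumption made for this sketch).
`%!in%` <- Negate(`%in%`)

territories <- c("AS", "GU", "MP", "PR", "VI")

# Made-up rows; the real PAD-US table has many more columns.
toy <- tibble(
  Unit_Nm  = c("Valley of Fire State Park", "Example Territory Park", "Example Federal Unit"),
  State_Nm = c("NV", "PR", "NV"),
  Own_Type = c("STAT", "STAT", "FED")
)

# Keep a row only when both conditions are TRUE:
# it is not in a territory AND it is state owned.
kept <- toy %>%
  filter(State_Nm %!in% territories & Own_Type == "STAT")

kept$Unit_Nm
```

The Puerto Rico row fails the first condition and the federal row fails the second, so only Valley of Fire remains.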
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After the two conditions in this line are applied, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"
s; for the rest, I just populated the new column with the old values.
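A minimal sketch of that general form, on a made-up three-row table: one value is recoded and a final TRUE branch copies the old value into the new column for everything else. The replacement label "State Recreation Area" is only an example.

```r
library(dplyr)

# Made-up designations for illustration.
toy <- tibble(
  d_Des_Tp = c("Recreation Management Area", "State Park", "State Wilderness")
)

recoded <- toy %>%
  mutate(type = case_when(
    d_Des_Tp == "Recreation Management Area" ~ "State Recreation Area",
    TRUE ~ d_Des_Tp # anything not matched above keeps its original value
  ))

recoded$type
```

Without the TRUE branch, unmatched rows would get NA in the new column.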
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered that in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-
state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
- filter(Own_Type == "STAT")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part I of this series.
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/99/8b4af99d68d1a7a3e2762c1652d219fe8062ad67d3f13ade057a9aa63e7b01 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/99/8b4af99d68d1a7a3e2762c1652d219fe8062ad67d3f13ade057a9aa63e7b01 deleted file mode 100644 index 9a2778c..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/99/8b4af99d68d1a7a3e2762c1652d219fe8062ad67d3f13ade057a9aa63e7b01 +++ /dev/null @@ -1,595 +0,0 @@ -I"ŻĺThis is part three of my cartography in R series. If you are just finding this, I suggest taking a look at part I and part II first.
- -In this post, I will download and process the National Park data. Once that’s done, I’ll add it to the base map I created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered that in part one.
- -1
-2
-
## load data
- states <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
to reflect wherever you saved the shifted shapefile.
If your data processing and base map creation are in the same file, you can skip this line, and when you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -The National Park Service provides all the data we’ll need to make the map. The data is accessible on the ArcGIS’ Open Data website. Once you click on the link you’ll see a bunch of icons that lead to different data that’s available for download. Click on the one for boundaries.
- - - -From here, you’ll be taken to a list of available National Park data. The second link should be nps boundary which contains the shape data for all the National Parks in the United States. The file contains all the data for the park outlines along with hiking trails, rest areas, and lots of other data.
- - - -The nps boundary link will take you to a map showing the national parks. On the left, there will be a download link.
- - - -From here, you’ll have a few download options. The National Park Service provides the data in different formats including CSV and Shapefile. You’ll want to download the shapefile version.
- - - -Be sure to save the file somewhere on your hard drive that is easy to find. When it finishes downloading, be sure to unzip the file. There will be four files inside the folder. All of them need to be kept in the same location. Even though we’ll only load the .shp
file, R uses the three others to create the necessary shapes.
The code below may look intimidating, but it’s fairly straightforward. I’ll go over each line below.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- ## load and process nps data
- nps <- read_sf("./shapefiles/original/nps/NPS_-_Land_Resources_Division_Boundary_and_Tract_Data_Service.shp") %>%
- select(STATE, UNIT_TYPE, PARKNAME, Shape__Are, geometry) %>%
- filter(STATE %!in% territories) %>%
- mutate(type = case_when(UNIT_TYPE == "International Historic Site" ~ "International Historic Site", # there's 23 types of national park, I wanted to reduce this number.
- UNIT_TYPE == "National Battlefield Site" ~ "National Military or Battlefield", # lines 56-77 reduce the number of park types
- UNIT_TYPE == "National Military Park" ~ "National Military or Battlefield",
- UNIT_TYPE == "National Battlefield" ~ "National Military or Battlefield",
- UNIT_TYPE == "National Historical Park" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Historic Site" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Historic Trail" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Memorial" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Monument" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Preserve" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National Reserve" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National Recreation Area" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National River" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Lakeshore" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Wild & Scenic River" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Seashore" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Trails Syste" ~ "National Trail",
- UNIT_TYPE == "National Scenic Trail" ~ "National Trail",
- UNIT_TYPE == "National Park" ~ "National Park or Parkway",
- UNIT_TYPE == "Park" ~ "National Park or Parkway",
- UNIT_TYPE == "Parkway" ~ "National Park or Parkway",
- UNIT_TYPE == "Other Designation" ~ "Other National Land Area")) %>%
- mutate(visited = case_when(PARKNAME == "Joshua Tree" ~ "visited",
- PARKNAME == "Redwood" ~ "visited",
- PARKNAME == "Santa Monica Mountains" ~ "visited",
- PARKNAME == "Sequoia" ~ "visited",
- PARKNAME == "Kings Canyon" ~ "visited",
- PARKNAME == "Lewis and Clark" ~ "visited",
- PARKNAME == "Mount Rainier" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted national park data
- st_write(nps, "~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
In part I of this series I talked about how R has an %in%
function, but not a %!in%
function. Here’s where the latter function shines.
The United States is still an empire with its associated territories and islands. In this project I am interested in the 50 states - without these other areas. As a result, I need to filter them out. Using base R’s %in%
function I would have to create a variable that contains the postal abbreviations for all 50 states. That is annoying. Instead, I want to use the shorter list that only includes the US’ associated islands and territories. To do so, however, I need to use the operator tools’ %!in%
function.
Line 2 creates the list of US territories that I filter out in line 7. The c()
function in R means combine or concatenate. Inside the parentheses are the five postal codes for American Samoa, Guam, the Northern Mariana Islands, Puerto Rico, and the U.S. Virgin Islands.
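To make the idea concrete, here is a minimal base R sketch of what a "not in" operator does. The name `%not_in%` and the `state_codes` vector are made up for illustration; this is not how operator.tools implements `%!in%`, only an equivalent one-liner.

```r
# Hypothetical stand-in for a "not in" operator, written in base R.
# It simply negates the result of %in%.
`%not_in%` <- function(x, table) !(x %in% table)

territories <- c("AS", "GU", "MP", "PR", "VI")
state_codes <- c("CA", "GU", "NV", "PR")  # toy values, not the NPS data

# TRUE where the code is NOT one of the territories
keep <- state_codes %not_in% territories
kept_states <- state_codes[keep]
print(kept_states)
```

The short territories list is all that is needed; there is no reason to type out all 50 state abbreviations.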
nps <- read_sf("path/to/file.shp")
loads the National Park data set to a variable called nps
using the read_sf()
function that is part of the sf package. You will need to change the file path so it reflects where you saved the data on your hard drive.
The %>%
operator is part of the tidyverse package. It passes the result of the expression on its left to the next function as that function’s first argument, which lets you chain processing steps together. It has to go at the end of a line, rather than the beginning.
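A small sketch of the pipe on plain numbers, assuming the magrittr package is installed (loading the tidyverse also makes `%>%` available). The variable names are made up for illustration.

```r
library(magrittr)  # provides %>%

values <- c(1, 4, 9)

# Nested call: read from the inside out
nested <- sum(sqrt(values))

# Piped call: read top to bottom, each result feeds the next function
piped <- values %>%
  sqrt() %>%
  sum()

print(c(nested = nested, piped = piped))
```

Both versions compute the same thing; the piped form just keeps each step on its own line, which is why the long `nps` chain stays readable.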
select
is part of the tidyverse package. With it, we can select columns by their name rather than their associated number. Large data sets take more computing power because the computer has to iterate over more rows. Unfortunately, rendering maps also takes a lot of computing power so I like to discard any unnecessary columns to reduce the amount of effort my computer has to exert.
Deciding on which columns to keep will depend on the data you’re using and what you want to map (or analyze). I know for my project I want to include a few things:
-There are a couple of ways to inspect the data to see what kind of information is available.
- -view(nps)
but as the number of data points increases, so does R’s struggle with opening it. I’ve found that VSCode doesn’t throw as big of a fit as RStudio when opening large data sets.
data.frame(colnames(nps))
. This will return a list of the data set’s column names. This is my preferred method. I then go to the documentation to see what each column contains. This isn’t fool-proof because it really depends on whether the data has good documentation.
The National Park data includes a lot of information about who created the data and maintains the property. I’m not interested in this, so in line 6 I select the following columns:
-The geometry column is specific to shapefiles and it includes the coordinates of the shape. It will be kept automatically - unless you use the st_drop_geometry()
function. I like to specifically select so I remember it’s there.
In line 7 I use the territories list I created in line 2 to filter out the United States’ associated areas. Since the nps data uses the two character state abbreviation, I have to use the two character abbreviation for the territories. Searching for “Guam,” for example, won’t work.
- -filter()
is part of the tidyverse and it uses conditional language. In the parentheses is a condition that must be true if the tidyverse is going to keep the row. Starting at the top of the data, R goes “alright, does the value in the STATE column match any of the values in the territories list?” If the condition is TRUE, R adds the row to the new data frame.
With the %!in%
operator, any row that evaluates as TRUE will be kept because the value is NOT found in the territories list. If I wanted to keep only the territories, I would use the %in%
operator and only the rows with STATE abbreviations found in the territories list would be kept. For example, if the STATE value in row 1 is CA, filter looks at it and goes “is CA NOT IN territories?” If that is TRUE, keep it because we want only the values that are NOT IN the territories list.
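Here is a hedged sketch of the two filters side by side on a toy table, assuming dplyr is installed. The rows are invented for illustration, and `!(STATE %in% territories)` is used as a base R equivalent of `%!in%` so the example does not need operator.tools.

```r
library(dplyr)

territories <- c("AS", "GU", "MP", "PR", "VI")

# Toy rows for illustration only, not the NPS data set
toy_parks <- tibble(
  STATE    = c("CA", "GU", "WA", "PR"),
  PARKNAME = c("Park A", "Park B", "Park C", "Park D")
)

# Keep rows whose STATE is NOT in the territories list
states_only <- toy_parks %>%
  filter(!(STATE %in% territories))

# Keep rows whose STATE IS in the territories list
territories_only <- toy_parks %>%
  filter(STATE %in% territories)

print(states_only)
print(territories_only)
```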
- -mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
The NPS data set has 23 different types of National Parks listed (you can view all of them by running levels(as.factor(nps$UNIT_TYPE))
). I know that in later posts, I’m going to color code the land by type (blue for rivers, green for national parks, etc) so I wanted to reduce the number of colors I would have to use.
mutate()
’s first argument, type =
creates a new column called type
. R will populate the newly created column with whatever comes after the first (singular) equal =
sign. For example, I can put type = NA
and every row in the column will say NA
.
Here, I am using the case_when()
function, which is also part of the tidyverse. The logic of case_when
is fairly straightforward. The first value is the name of the column you want R to look in (here: UNIT_TYPE
). Next, is a conditional. Here I am looking for an exact match (==
) to the string (words) inside the first set of quotation marks (in line 8: "International Historic Site"
). The last part of the argument is what I want R to put in the type
column when it finds a row where the UNIT_TYPE
is "International Historic Site"
.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
Lines 9-29 do the same thing for the other park types. You can reduce the parks however you want or use all 23 types. Just remember that the value before the tilde ~
has to match the values found in the data exactly. For example, in line 24 I change the NPS data’s National Trails Syste value to be National Trail. Whoever created the data set did not spell system correctly, so for R to match the value I also have to omit the last letter in system.
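The exact-match requirement is easy to see on a toy column. This sketch assumes dplyr is installed; the four rows are invented for illustration. Any value that matches none of the conditions gets `NA`, which is a quick way to spot a typo in your own conditions.

```r
library(dplyr)

# Toy values for illustration; the last one is spelled "correctly"
# and therefore does NOT match the truncated value in the condition.
toy <- tibble(
  UNIT_TYPE = c("National Park", "Parkway",
                "National Trails Syste", "National Trails System")
)

toy <- toy %>%
  mutate(type = case_when(
    UNIT_TYPE == "National Park"         ~ "National Park or Parkway",
    UNIT_TYPE == "Parkway"               ~ "National Park or Parkway",
    UNIT_TYPE == "National Trails Syste" ~ "National Trail"
  ))

print(toy)

# Rows that fell through every condition
unmatched <- toy %>% filter(is.na(type))
print(unmatched)
```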
Lines 30-37 use the same mutate()
and case_when
logic as above. Instead of reducing the number of park types, I use it to mark the different parks I have visited.
Line 30 creates the new column, visited
and uses case_when
to look for the names of the parks that I’ve been to. If I have visited them, it adds visited
to the column of the same name.
The last line, TRUE ~ "not visited"))
, acts as an else statement. For any park not listed above, it will put not visited
in the visited
column I created.
This feels like a very brute-force method of tracking which parks I’ve visited, but I haven’t spent much time trying to find another way.
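One possible alternative, offered as a sketch rather than a recommendation: keep the visited parks in a vector and test membership with `%in%`. This assumes dplyr is installed; `visited_parks` and the toy rows are made up for illustration.

```r
library(dplyr)

# Hypothetical list of visited parks; extend it as you travel
visited_parks <- c("Joshua Tree", "Redwood", "Mount Rainier")

# Toy rows for illustration only
toy <- tibble(PARKNAME = c("Joshua Tree", "Yosemite", "Mount Rainier"))

toy <- toy %>%
  mutate(visited = if_else(PARKNAME %in% visited_parks,
                           "visited", "not visited"))

print(toy)
```

Adding a park then means editing one vector instead of adding another `case_when()` line.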
- -In part I, when I made the base map, I moved Alaska and Hawaii so they were of similar size and closer to the continental USA. For the map to display the parks correctly, I have to shift them as well.
- -I went over these two lines in part II, so I won’t go over them again here. If you want to read more about them, check out that post.
- -The last line uses the st_transform()
function from the sf package to convert the data set from NAD83 to WGS84. Leaflet requires WGS84, so be sure to include this line at the end of your data manipulation.
I covered the WGS84 ellipsoid in part I, if you want to read more about it.
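As a small sketch of the same conversion on a single point, assuming sf is installed: here the coordinate systems are given as EPSG codes (4269 for NAD83, 4326 for WGS84 longitude/latitude) instead of the proj string used above. The coordinates are arbitrary.

```r
library(sf)

# One arbitrary point, declared as NAD83 (EPSG:4269)
pt_nad83 <- st_sfc(st_point(c(-114.51, 36.49)), crs = 4269)

# Convert to WGS84 longitude/latitude (EPSG:4326)
pt_wgs84 <- st_transform(pt_nad83, 4326)

# Check which coordinate system each object reports
print(st_crs(pt_nad83)$epsg)
print(st_crs(pt_wgs84)$epsg)
```

Checking `st_crs()` before mapping is a cheap way to confirm the transform actually ran.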
- -Strictly speaking, this line isn’t necessary. You can do all your data processing in the same file where you make your map, but I prefer to separate the steps into different files.
- -As a result, I save the shifted data to my hard drive so it’s easier to load later. I usually have this line commented out (by placing #
at the start of the line) after I save it the first time. I don’t want it to save every time I run the rest of the code.
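Instead of commenting the line in and out, one option is to write the file only when it does not exist yet. This is a sketch under assumptions: sf is installed, and `toy` and `out_path` are stand-ins for the real data and the real save location.

```r
library(sf)

# Toy layer standing in for the shifted park data
toy <- st_sf(
  name = "toy point",
  geometry = st_sfc(st_point(c(-120, 40)), crs = 4326)
)

# Hypothetical output location; swap in your own path
out_path <- file.path(tempdir(), "toy_shifted.shp")

# Only write on the first run; later runs skip the save
if (!file.exists(out_path)) {
  st_write(toy, out_path, quiet = TRUE)
}
```

If the data changes, delete the old file (or remove the guard) so the new version gets written.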
1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-
## create usa Base Map using leaflet()
- map <- leaflet() %>%
- addPolygons(data = states,
- smoothFactor = 0.2,
- fillColor = "#808080",
- fillOpacity = 0.5,
- stroke = TRUE,
- weight = 0.5,
- opacity = 0.5,
- color = "#808080",
- highlight = highlightOptions(
- weight = 0.5,
- color = "#000000",
- fillOpacity = 0.7,
- bringToFront = FALSE),
- group = "Base Map") %>%
- addPolygons(data = nps,
- smoothFactor = 0.2,
- fillColor = "#354f52",
- fillOpacity = 1,
- stroke = TRUE,
- weight = 1,
- opacity = 0.5,
- color = "#354f52",
- highlight = highlightOptions(
- weight = 3,
- color = "#fff",
- fillOpacity = 0.8,
- bringToFront = TRUE),
- group = "National Parks") %>%
- addLayersControl(
- baseGroups = "Base Map",
- overlayGroups = "National Parks",
- options = layersControlOptions(collapsed = FALSE))
-
Lines 2-16 are identical to those in part II where I created the base map. I am not going to cover these lines in detail, because I covered them previously.
- -To add the National Park data to the base map, we call addPolygons()
again. The arguments are the same as before - color, opacity, outline style - just with different values. By changing those values, we can differentiate the base map from the national park data.
Since we’re mapping the National Parks and not the states, we have to tell R where the data is located using data = nps
.
smoothFactor
determines how detailed the park boundaries should be. The lower the number, the more detailed the shape. The higher the number, the smoother the parks will render. I usually match this to whatever I set for the base map for consistency.
Define the color and transparency of the National Parks. In a future post, I am going to change the color of each type of public land, but for now, I’ll make them all a nice sage green color #354f52
. I also want the parks to be fully opaque.
The next four lines (21-24) define what kind of outline the National Parks will have. I detail each of these arguments in part II of this series.
- -Briefly, I want there to be an outline to each park (stroke = TRUE
) that’s thicker (weight = 1
) than the outline used on the base map. I do not like the way it looks at full opacity, so I make it half-transparent (opacity = 0.5
). Finally, I want the outline color = "#354f52"
to be the same color as the fill. This will matter more when I change the fill color of the parks later on.
Lines 25-29 define the National Park’s behavior on mouseover. First we have to define and initialize the highlightOptions()
function. The function takes arguments similar to those of the addPolygons
function - both of which I go over in detail in part II.
I want to keep the mouseover behavior noticeable, but simple. To do so, I set the outline’s thickness to be weight = 3
. This will give the shape a nice border that differentiates it from the rest of the map.
color = "#fff"
sets the outline’s color on mouseover only. So, when inactive, the outline color will match the fill color, but on mouseover the outline color switches to white (#fff
).
bringToFront
can either be TRUE
or FALSE
. If TRUE
, Leaflet will bring the park to the forefront on mouseover. This is useful later when we add in the state parks because national and state parks tend to be close together.
When FALSE
the shape will remain static.
Since Leaflet adds all new data to the top of the base map, I think it’s useful to group the layers together. In the next block of code, we add in some layer functionality. For now, though, I want to add the National Parks to their own group so I can hide the National Parks if I want.
- -addLayersControl
defines how layers are displayed on the final map. The function takes three arguments.
First, we have to tell Leaflet which layer should be used as the base map: baseGroups = "Base Map"
. The name in the quotations (here: "Base Map"
) has to match the name given to the layer you set in the addPolygons()
call. In line 16, I put the 50 states into a group called "Base Map"
, but you can name it anything you like.
There can be more than one base map, too. It’s not super helpful here since I shifted Alaska and Hawaii, but when using map tiles you can add multiple types of base maps that users can switch between.
- -Next, we have to define the layers that are shown on top of the base group: overlayGroups = "National Parks"
. Just like the base map, this is defined in the corresponding addPolygons
call. Here, I called the layer National Parks
in line 30.
Finally, on the map I don’t want the layers to be collapsed, so I set options = layersControlOptions(collapsed = FALSE)
. When TRUE
the map will display an icon in the top right that, when clicked, will show the available layers.
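A minimal sketch of the layers control with two overlay groups, assuming leaflet is installed. The two markers sit at arbitrary coordinates and stand in for real layers; the group names are only examples.

```r
library(leaflet)

map <- leaflet() %>%
  # Two arbitrary points standing in for real layers
  addCircleMarkers(lng = -114.51, lat = 36.49, group = "State Parks") %>%
  addCircleMarkers(lng = -115.90, lat = 33.87, group = "National Parks") %>%
  addLayersControl(
    # Names must match the group names used in the layer calls above
    overlayGroups = c("National Parks", "State Parks"),
    # collapsed = TRUE hides the list behind an icon until it is clicked
    options = layersControlOptions(collapsed = TRUE))

map
```

Passing a character vector to `overlayGroups` is how more than one layer gets its own checkbox.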
Hey, look at that! You made a base map and you added some National Park data to it. You’re a certified cartographer now!
- -In part IV we’ll download and process the state park data before adding it to the map. In part V of this series we’ll add Shiny functionality and some additional markers.
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/9a/80d109580bc466f81cddb2c7337a7dd7020f19aeee595dd2d803ee8d6d7e7c b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/9a/80d109580bc466f81cddb2c7337a7dd7020f19aeee595dd2d803ee8d6d7e7c deleted file mode 100644 index e9a6d48..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/9a/80d109580bc466f81cddb2c7337a7dd7020f19aeee595dd2d803ee8d6d7e7c +++ /dev/null @@ -1,306 +0,0 @@ -I"’\This is a continuation of my previous post where I walked through how to download and modify shape data. I also showed how to shift Alaska and Hawaii so they are closer to the continental usa. -
- -In this post, I’ll go over how to use Leaflet to map the shapefile we made in the previous post. If you’ve come here from part one of the series, you probably have the libraries and data loaded already. However, if you don’t, be sure to load the libraries and shapefiles before moving to number two.
- -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-
## load data
- states <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
to reflect wherever you saved the shifted shapefile.
If your data processing and base map creation are in the same file, you can skip this line, and when you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -At its most basic, all Leaflet needs to create a map is a base map and data layers. The code below may look intimidating, but it’s mostly style options.
- -This is the map we’re going to create. It’s a simple grey map and each state darkens in color as you hover over it. I’ll show the same map after each style option is added so you can see what effect it has.
- - - -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-
## create usa base map using leaflet()
- map <- leaflet() %>%
- addPolygons(data = states,
- smoothFactor = 0.2,
- fillColor = "#808080",
- fillOpacity = 0.5,
- stroke = TRUE,
- weight = 0.5,
- opacity = 0.5,
- color = "#808080",
- highlight = highlightOptions(
- weight = 0.5,
- color = "#000000",
- fillOpacity = 0.7,
- bringToFront = FALSE),
- group = "Base Map")
-
leaflet()
initializes the map widget. I save it to a variable called map (map <-
) so I can run other code in the file without recreating the map each time. When you want to see the map, you can type map
(or whatever you want to name your map) in the terminal and hit enter. R will display the map in the viewer.
addPolygons()
adds a layer to the map widget. Leaflet has different layer options, including addTiles
and addMarkers
which do different things. You can read about them on the leaflet website. Since we’re using a previously created shapefile, we’ll add the shapefile to the map using addPolygons()
.
The first argument you need to specify after calling addPolygons is data = [data-source]
. [data-source]
is whatever variable your data is stored in. For me, it’s called states
. This is either the processed data from part I of this series or the saved shapefile loaded above under the section called load data.
When you run only the first two lines, Leaflet will use its default styling. The base color will be a light blue and the outlines of the states will be dark blue and fairly thick.
- - - -You can leave the base map like this if you want, but all additional data will be added as a layer on top of this map which can become distracting very quickly. I prefer to make my base maps as basic and unobtrusive as possible so the data I add on top of the base map is more prominent.
- -smoothFactor
controls how much the polygon shape should be smoothed at each zoom level. The lower the number the more accurate your shapes will be. A larger number, on the other hand, will lead to better performance, but can distort the shapes of known areas.
I keep the smoothFactor
low because I want the United States to appear as a coherent land mass. The image below shows three different maps, each with a different smoothFactor to illustrate what this argument does. On the left, the map’s smoothFactor=0.2
, the center map’s smoothFactor=10
, and the right’s smoothFactor=100
.
As you can see, the higher the smoothFactor
the less coherent the United States becomes.
addPolygons()
.
-fillColor
refers to what color is on the inside of the polygons. Since I want a minimal base map, I usually set this value to be some shade of grey. If you want a different color, you only need to replace #808080
with the corresponding hex code for the color you want. Here is a useful hex color picker. If you have a hex value and you want the same color in a different shade, this is a useful site.
fillOpacity
determines how transparent the color inside the shape should be. I set mine to be 0.5
because I like the way it looks. The number can be between 0 and 1 with 1 being fully opaque and 0 being fully transparent.
The next four lines define the appearance of the shapes’ outline.
- -The stroke
property can be set to either TRUE
or FALSE
. When true, Leaflet adds an outline around each polygon. When false, the polygons have no outline. In the image below, the map on the left has the default outlines and on the right stroke = FALSE
.
weight = 0.5
sets the thickness of the outlines to be 0.5 pixels. This can be any value you want with higher numbers corresponding to thicker lines. Lower numbers correspond to thinner lines.
The opacity
property operates in the same way as fill opacity above, but on the outlines. The number can be between 0 and 1. Lower numbers correspond to the lines being more transparent and 1 means fully opaque.
color = "#808080"
sets the color of the outline. I typically set it to be the same color as the fill color.
If you want a static base map then lines 2-10 are all you need, as shown in the image below. I like to add some functionality to my base map so that the individual states become darker when they’re hovered over.
- - - -Lines 11-15 define the map’s behavior when the mouse hovers over the shape. Most of the options are the same as the ones used on the base polygon shapes, so I won’t go into them with much detail.
- -highlight = highlightOptions()
contains the mouseover specifications. The word before the equal sign has to be either highlight
or highlightOptions
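. It looks like highlight is declared twice, but the two do different jobs. The word before the equals sign names the addPolygons() argument (R partially matches highlight to highlightOptions), while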
highlightOptions()
is the actual function call.
weight
, color
, and fillOpacity
all operate in the same way as before, but whatever values you specify here will only show up when the mouse hovers over.
bringToFront
takes one of two values: TRUE
or FALSE
. It only really matters when you have multiple layers (like we will in later parts of this series). When bringToFront = TRUE
hovering over the state will bring it to the front. When bringToFront = FALSE
it will stay in the back.
Since the base map has only one layer, this property doesn’t affect anything.
- -group = "Base Map")
lets you group multiple layers together. This argument will come in handy as we add more information to the map. The base map is the default layer and is always visible - though, when you use map tiles you can define multiple base layers. All other layers will be on top of the base layer. When using different groups, you can define functionality that allows users to turn off certain layers.
You’ve created your first base map! It’s a boring flat, grey map, but it’s the base we’ll use when adding in the national and state park data. In part III of this series we’ll process and add in the National Parks.
- - -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/9c/e82b25b3111f2efecad4d980ce1ff3bcbe0eeec53504f56bf7e8e6dbf822e1 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/9c/e82b25b3111f2efecad4d980ce1ff3bcbe0eeec53504f56bf7e8e6dbf822e1 deleted file mode 100644 index 9267bbb..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/9c/e82b25b3111f2efecad4d980ce1ff3bcbe0eeec53504f56bf7e8e6dbf822e1 +++ /dev/null @@ -1,562 +0,0 @@ -I"ĚWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data that do not require loading all of it into R. The first is to load only the geopackage’s layer names. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
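If loading a whole layer is too heavy, `st_read()` also accepts a `query` argument, so the filtering can happen while the file is read. The sketch below assumes sf is installed and builds a tiny throwaway geopackage so it can run anywhere. The layer name `toy_fee` and the rows are invented; only the column names `Unit_Nm` and `Own_Type` are borrowed from PAD-US.

```r
library(sf)

# Toy layer for illustration; not the PAD-US data
toy <- st_sf(
  Unit_Nm  = c("Toy State Park", "Toy Federal Site"),
  Own_Type = c("STAT", "FED"),
  geometry = st_sfc(st_point(c(-114.5, 36.5)),
                    st_point(c(-115.9, 33.9)), crs = 4326)
)

# Write it to a temporary geopackage (remove any earlier copy first)
gpkg_path <- file.path(tempdir(), "toy_padus.gpkg")
if (file.exists(gpkg_path)) unlink(gpkg_path)
st_write(toy, gpkg_path, layer = "toy_fee", quiet = TRUE)

# List the layer names without loading any features
layer_names <- st_layers(gpkg_path)$name
print(layer_names)

# Read only the rows that match the SQL condition
state_only <- st_read(
  gpkg_path,
  query = "SELECT * FROM toy_fee WHERE Own_Type = 'STAT'",
  quiet = TRUE
)
print(state_only)
```

With the real file, the same pattern would use the actual layer name and path. Whether it keeps R from crashing on the full PAD-US data is untested here.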
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
 specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
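The post doesn't show the combined attempt, so this is only a guess: one likely reason folding lines 7-15 into the first filter() fails is operator precedence. In R, & binds more tightly than |, so A & B | C | D is read as (A & B) | C | D. A minimal sketch of a single call that avoids the problem by using %in% with a vector of codes. The parks table below is made-up toy data, not PAD-US.

```r
library(dplyr)

# Toy stand-in for the PAD-US attribute table (all values are invented)
parks <- tibble(
  Unit_Nm  = c("A", "B", "C", "D"),
  State_Nm = c("NV", "PR", "CA", "OR"),
  Own_Type = c("STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SP", "SP", "MIL")
)

territories  <- c("AS", "GU", "MP", "PR", "VI")
designations <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Conditions separated by commas are combined with AND,
# and %in% replaces the long chain of == ... | == ...
kept <- parks %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% designations)

kept$Unit_Nm
```

Only park A survives here: B is in a territory, C is federal, and D has a designation that isn't on the list.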
-Yet another filter()
 call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"s; for the rest, I just populated the new column with the old values.
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/9c/fe46a5db861bc21c0996d7f247b81e8b6f6ef95543271a9092265410edcbf6 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/9c/fe46a5db861bc21c0996d7f247b81e8b6f6ef95543271a9092265410edcbf6 deleted file mode 100644 index 2a60048..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/9c/fe46a5db861bc21c0996d7f247b81e8b6f6ef95543271a9092265410edcbf6 +++ /dev/null @@ -1,638 +0,0 @@ -I"ńWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data that do not rely on loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
 so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
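Because the real PAD-US file is so large, a tiny throwaway geopackage is an easy way to see what st_layers() reports. The sketch below is only an illustration: the layer name demo_fee, the three points, and their attribute values are all made up. It also shows st_read()'s query argument, which pulls only the rows matching a SQL statement instead of the whole layer. I haven't tested that against the full PAD-US file.

```r
library(sf)

# Made-up points standing in for park records
demo <- st_as_sf(
  data.frame(
    Unit_Nm  = c("Park A", "Park B", "Base C"),
    Own_Type = c("STAT", "STAT", "FED"),
    lon      = c(-114.5, -124.2, -115.8),
    lat      = c(36.4, 42.0, 33.3)
  ),
  coords = c("lon", "lat"),
  crs = 4326
)

# Write them to a temporary geopackage with a hypothetical layer name
gpkg <- tempfile(fileext = ".gpkg")
st_write(demo, gpkg, layer = "demo_fee", quiet = TRUE)

# List the layers without loading any features
layer_info <- st_layers(gpkg)
layer_info$name

# Read only the state-owned rows by passing a SQL query
state_only <- st_read(
  gpkg,
  query = "SELECT * FROM demo_fee WHERE Own_Type = 'STAT'",
  quiet = TRUE
)
nrow(state_only)
```

The same pattern might help with a large file, since the filtering happens while the layer is being read rather than after everything is in memory.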
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer and filter for the 50 states and for state-owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access" &
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
 specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
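To see how the two conditions work together, here is a small sketch on made-up rows (not the real PAD-US table). It uses !(x %in% y), which as far as I can tell does the same job as %!in% from operator.tools, and it checks the distinct values and row counts before and after the filter.

```r
library(dplyr)

# Invented rows standing in for the Fee layer's attribute table
parks <- tibble(
  State_Nm = c("NV", "PR", "CA", "GU", "OR"),
  Own_Type = c("STAT", "STAT", "FED", "LOC", "STAT")
)

territories <- c("AS", "GU", "MP", "PR", "VI")

# Distinct ownership codes, the same information levels(as.factor()) gives
own_types <- sort(unique(parks$Own_Type))

n_before <- nrow(parks)

# A row is kept only when BOTH conditions are TRUE
filtered <- parks %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT")

n_after <- nrow(filtered)
c(before = n_before, after = n_after)
```

Comparing nrow() before and after each filter() is a cheap way to confirm that a condition is removing roughly what you expect.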
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
 call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
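One thing to watch with != inside filter(): rows where the column is NA are dropped too, because NA != "Closed" evaluates to NA and filter() only keeps rows that are TRUE. I don't know whether d_Pub_Acce actually contains missing values in PAD-US, so treat this as a sketch on made-up data showing the difference.

```r
library(dplyr)

# Invented access values, including one missing entry
access <- tibble(
  Unit_Nm    = c("A", "B", "C", "D"),
  d_Pub_Acce = c("Open Access", "Closed", NA, "Restricted Access")
)

excluded <- c("Closed", "Unknown")

# Same style as the post: the NA row is silently dropped
strict <- access %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")

# %in% returns FALSE for NA, so negating it keeps the NA row
keep_na <- access %>%
  filter(!(d_Pub_Acce %in% excluded))

strict$Unit_Nm
keep_na$Unit_Nm
```

Which behavior is right depends on whether a park with no recorded access type should appear on the map.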
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"s; for the rest, I just populated the new column with the old values.
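A minimal sketch of that pattern on a made-up column, in case the general form is easier to follow with something runnable. The final TRUE ~ d_Des_Tp line is the "keep the old value" fallback; without a fallback, any row that matches none of the conditions ends up as NA.

```r
library(dplyr)

# Invented designation values
designations <- tibble(
  d_Des_Tp = c("State Park",
               "Recreation Management Area",
               "State Wilderness",
               "Something Else")
)

relabelled <- designations %>%
  mutate(type = case_when(
    d_Des_Tp == "State Park" ~ "State Park or Parkway",
    # %in% folds several old values into one new category
    d_Des_Tp %in% c("Recreation Management Area", "State Wilderness") ~
      "State Preserve, Reserve, or Recreation Area",
    # fallback: copy the original value unchanged
    TRUE ~ d_Des_Tp
  ))

relabelled$type
```

Conditions are checked top to bottom and the first match wins, so the fallback has to come last.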
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
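One possible way to keep track, sketched here as an idea rather than what the project actually does: keep the visited park names in a single vector (it could just as easily come from a small text or CSV file) and flag matches with %in%. The park list and the stray leading space in the toy data are invented to show why trimming whitespace before matching can matter.

```r
library(dplyr)

# Hypothetical list of visited parks, maintained in one place
visited_parks <- c("Valley of Fire State Park",
                   "Anza-Borrego Desert State Park")

# Toy data; note the leading space in the second name
parks <- tibble(
  Unit_Nm = c("Valley of Fire State Park",
              " Anza-Borrego Desert State Park",
              "Some Other Park")
)

parks <- parks %>%
  mutate(visited = if_else(trimws(Unit_Nm) %in% visited_parks,
                           "visited",
                           "not visited"))

parks$visited
```

Adding a park then means adding one string to visited_parks instead of another case_when() line.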
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so they appear under the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
- -1
-2
-3
-4
-5
-6
-7
-
## split the data by state
- split_states <- split(state_parks, f = state_parks$State_Nm) # split the data by state
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))
- }
-
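If the loop feels opaque, it can help to run the same split-then-write pattern on a tiny plain data frame first. This sketch uses made-up rows and writes CSV files to a temporary folder instead of shapefiles, so it needs nothing beyond base R. It also creates the output folder up front, since I'd expect writing into a folder that doesn't exist yet to fail.

```r
# Invented rows standing in for the state park table
parks <- data.frame(
  Unit_Nm  = c("A", "B", "C"),
  State_Nm = c("NV", "CA", "NV"),
  stringsAsFactors = FALSE
)

# Temporary output folder (hypothetical location, for the demo only)
out_dir <- file.path(tempdir(), "individual")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# split() returns a named list with one data frame per state
split_states <- split(parks, f = parks$State_Nm)

# Loop over the names and write one file per state
for (name in names(split_states)) {
  write.csv(split_states[[name]],
            file.path(out_dir, paste0(name, ".csv")),
            row.names = FALSE)
}

written <- sort(list.files(out_dir))
written
```

Swapping write.csv() back to st_write() and the .csv extension to .shp gives the same shape as the loop in the post.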
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me a line if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data that do not rely on loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
 so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access" &
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE, # shift_geometry() comes from the tigris package, which must be loaded
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(), although read_sf() is a wrapper around st_read() and can read a geopackage layer as well.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
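One possible reason the single-call version misbehaves (this is an assumption, since the failing code isn’t shown) is operator precedence: in R, & binds more tightly than |, so a & b | c is read as (a & b) | c. The sketch below uses a made-up data frame (toy_parks and its rows are hypothetical) to show the difference, and how %in% with a vector of codes avoids the chain of | altogether.

```r
library(dplyr)

# hypothetical toy data standing in for the PAD-US Fee layer
toy_parks <- tibble(
  Unit_Nm  = c("A", "B", "C", "D"),
  Own_Type = c("STAT", "FED", "STAT", "FED"),
  Des_Tp   = c("SP", "SP", "MIL", "REC")
)

# without parentheses this is parsed as
# (Own_Type == "STAT" & Des_Tp == "SP") | Des_Tp == "REC"
no_parens <- toy_parks %>%
  filter(Own_Type == "STAT" & Des_Tp == "SP" | Des_Tp == "REC")

# %in% keeps the designation test as a single condition
keep_types <- c("SP", "REC")
one_call <- toy_parks %>%
  filter(Own_Type == "STAT" & Des_Tp %in% keep_types)

no_parens$Unit_Nm  # "A" "D" (the federal row D slips through)
one_call$Unit_Nm   # "A"
```

With the real data, keep_types would hold the nine designation codes from lines 7-15, and the ownership and territory conditions could sit in the same filter() call.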
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"
s; for the rest I just populated the new column with the old values.
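As a small illustration of the general form, here is a sketch on made-up values (the toy data frame and the "Marine Protected Area" row are hypothetical). It shows two things worth knowing: any row that matches no condition becomes NA, and a final TRUE ~ COLUMN_NAME line copies the old value over instead.

```r
library(dplyr)

# hypothetical designation values
toy <- tibble(
  d_Des_Tp = c("State Park", "Recreation Management Area", "Marine Protected Area")
)

toy <- toy %>%
  mutate(
    # no fallback: rows that match nothing become NA
    type_no_fallback = case_when(
      d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area"
    ),
    # TRUE ~ d_Des_Tp keeps the original value for every other row
    type = case_when(
      d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
      TRUE ~ d_Des_Tp
    )
  )

toy$type_no_fallback  # NA, "State Preserve, Reserve, or Recreation Area", NA
toy$type              # "State Park", "State Preserve, Reserve, or Recreation Area", "Marine Protected Area"
```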
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
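One way to make the tracking less unwieldy, offered as a suggestion rather than what the project code does, is to keep the visited names in a single vector and test membership with %in%. In this sketch visited_parks, toy_parks, and "Some Other State Park" are made up for illustration.

```r
library(dplyr)

# one place to maintain the list of visited parks
visited_parks <- c(
  "Valley of Fire State Park",
  "Crissey Field State Recreation Site",
  "Salton Sea"
)

# hypothetical stand-in for the state_parks data
toy_parks <- tibble(
  Unit_Nm = c("Valley of Fire State Park", "Some Other State Park")
)

# a single if_else() replaces one case_when() line per park
toy_parks <- toy_parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

toy_parks$visited  # "visited" "not visited"
```

The vector could also be read from a plain text or CSV file, so adding a park never means touching the processing code.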
- -The logic is the same as the National Park data. mutate()
creates a new column visited
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “for each item in a list, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
- -1
-2
-3
-4
-5
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -The split()
is part of base R. It takes quite a few arguments, most of which are optional.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks
data into its corresponding states, so it is listed first.
The second argument f =
is how you want the data split. f
in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm))
in the terminal, it will return a list of the 50 state abbreviations. That is what we’re telling R to do here.
You can access an individual state using the $
operator. split_states$CA
will return the state park data for California.
names
is also part of base R. It does what it sounds like - it gets the names of an object. Here, I want to get the names of each of the split data sets.
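To see split() and names() in isolation, here is a minimal sketch on a made-up data frame (toy_parks and its three rows are hypothetical; the real data is the state_parks object).

```r
# hypothetical toy data with two states
toy_parks <- data.frame(
  Unit_Nm  = c("Park 1", "Park 2", "Park 3"),
  State_Nm = c("CA", "NV", "CA")
)

# one list element per distinct value of State_Nm
split_toy <- split(toy_parks, f = toy_parks$State_Nm)

names(split_toy)         # "CA" "NV"
nrow(split_toy$CA)       # 2
nrow(split_toy[["NV"]])  # 1
```

The $ form and the [["..."]] form return the same element; the second one is the form that works inside a loop, because the name can come from a variable.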
Here’s the actual for loop.
- -The basic logic of a for loop is:
- for(x in y){
- do something}
Inside the parentheses, y is the collection to loop over and x is a placeholder for the current element. The content in the curly braces runs once for each element of y.
- -In line 4, for(name in all_names){
says for each name in the list of all names, do whatever is inside the curly braces. name
can be whatever you want. It’s a placeholder value. I can have it say for(dogs in all_names){
and, as long as I also use dogs inside the curly braces, it will still do the exact same thing. A lot of the time you’ll see it as an i
for item. I like to use more descriptive language because, again, for loops are my Achilles’ heel.
The all_names
part is wherever you want R to look for the data. It will change based on your data set and variable naming conventions.
In line 5, I save the split data sets.
- -st_write()
is part of the sf package which allows us to create shapefiles. This can be any saving function (eg. write_csv() if you want to save CSVs). The function takes several arguments. In line 43 above I showed the basic structure: st_write(data, path/to/file.shp). This is good if you only have one file, but since I’m saving them in a loop I don’t want all of the files to have the same name. R will error out after the first and tell you the file already exists.
The first part split_states[[name]]
is still telling R what data to save, but using an index instead of a specific data frame name. To access an index you use data[[some-value]]
where some-value
is either a position or a name. In my code, R will take the split_states
data and go “alright, on the first pass the name in [[name]]
is the first one (here, AK), so return the data frame stored under that name.” It will then do that for every name as it loops through the split_states
data.
paste0()
is also part of base R - it’s apparently faster than paste()
. It concatenates (or links together) different pieces into one. I’m using it to create the filename. Within the paste0
call anything within quotation marks is static. So every file will be saved to "shapefiles/shifted/states/individual/"
and every file will have the extension .shp
. What will change with each loop is the name
of the file. One by one, R will loop through and save each file using the name
it pulled from all_names
.
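The whole loop can be tried without touching the project files. This sketch makes two substitutions that are my assumptions, not the project code: it writes CSVs with base R’s write.csv() instead of shapefiles with st_write(), and it saves into a temporary folder. The toy data is hypothetical.

```r
# hypothetical toy data, split by state as in the text
toy_parks <- data.frame(
  Unit_Nm  = c("Park 1", "Park 2", "Park 3"),
  State_Nm = c("CA", "NV", "CA")
)
split_toy <- split(toy_parks, f = toy_parks$State_Nm)

# throwaway folder so the sketch leaves the project alone
out_dir <- tempdir()

for (name in names(split_toy)) {
  # name holds the string "CA" on the first pass, then "NV"
  out_path <- paste0(out_dir, "/", name, ".csv")
  write.csv(split_toy[[name]], out_path, row.names = FALSE)
}

file.exists(paste0(out_dir, "/", c("CA", "NV"), ".csv"))  # TRUE TRUE
```

Because the state abbreviation is pasted into the path, each pass through the loop writes to a different file.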
st_write()
automatically creates the other three files that each “shapefile” needs. When the loop is done, you should have a folder of 200 files (50 states * 4 files each). Which is why I strongly recommend using DVC if you’re doing any kind of version control.
That’s all the processing done for the state files… for now. In part VI I’ll return to the states to create each state’s own map. Next up, in part V, I’m going back to my base map with the National Parks to add in some informational tool tips and interactivity.
- -* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me an email or a tweet if it is.
-** print("Hello World!")
This is part three of my cartography in R series. If you are just finding this, I suggest taking a look at part I and part II first.
- -In this post, I will download and process the National Park data. Once that’s done, I’ll add it to the base map I created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-
## load data
- states <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
to reflect wherever you saved the shifted shapefile.
If your data processing and base map creation are in the same file, you can skip this line, and when you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -The National Park Service provides all the data we’ll need to make the map. The data is accessible on the ArcGIS’ Open Data website. Once you click on the link you’ll see a bunch of icons that lead to different data that’s available for download. Click on the one for boundaries.
- - - -From here, you’ll be taken to a list of available National Park data. The second link should be nps boundary which contains the shape data for all the National Parks in the United States. The file contains all the data for the park outlines along with hiking trails, rest areas, and lots of other data.
- - - -The nps boundary link will take you to a map showing the national parks. On the left, there will be a download link.
- - - -From here, you’ll have a few download options. The National Park Service provides the data in different formats including CSV and Shapefile. You’ll want to download the shapefile version.
- - - -Be sure to save the file somewhere on your hard drive that is easy to find. When it finishes downloading, be sure to unzip the file. There will be four files inside the folder. All of them need to be kept in the same location. Even though we’ll only load the .shp
file, R uses the three others to create the necessary shapes.
The code below may look intimidating, but it’s fairly straightforward. I’ll go over each line below.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- ## load and process nps data
- nps <- read_sf("./shapefiles/original/nps/NPS_-_Land_Resources_Division_Boundary_and_Tract_Data_Service.shp") %>%
- select(STATE, UNIT_TYPE, PARKNAME, Shape__Are, geometry) %>%
- filter(STATE %!in% territories) %>%
- mutate(type = case_when(UNIT_TYPE == "International Historic Site" ~ "International Historic Site", # there's 23 types of national park, I wanted to reduce this number.
- UNIT_TYPE == "National Battlefield Site" ~ "National Military or Battlefield", # lines 56-77 reduce the number of park types
- UNIT_TYPE == "National Military Park" ~ "National Military or Battlefield",
- UNIT_TYPE == "National Battlefield" ~ "National Military or Battlefield",
- UNIT_TYPE == "National Historical Park" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Historic Site" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Historic Trail" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Memorial" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Monument" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Preserve" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National Reserve" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National Recreation Area" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National River" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Lakeshore" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Wild & Scenic River" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Seashore" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Trails Syste" ~ "National Trail",
- UNIT_TYPE == "National Scenic Trail" ~ "National Trail",
- UNIT_TYPE == "National Park" ~ "National Park or Parkway",
- UNIT_TYPE == "Park" ~ "National Park or Parkway",
- UNIT_TYPE == "Parkway" ~ "National Park or Parkway",
- UNIT_TYPE == "Other Designation" ~ "Other National Land Area")) %>%
- mutate(visited = case_when(PARKNAME == "Joshua Tree" ~ "visited",
- PARKNAME == "Redwood" ~ "visited",
- PARKNAME == "Santa Monica Mountains" ~ "visited",
- PARKNAME == "Sequoia" ~ "visited",
- PARKNAME == "Kings Canyon" ~ "visited",
- PARKNAME == "Lewis and Clark" ~ "visited",
- PARKNAME == "Mount Rainier" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted national park data
- st_write(nps, "~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
In part I of this series I talked about how R has an %in%
function, but not a %!in%
function. Here’s where the latter function shines.
The United States is still an empire with its associated territories and islands. In this project I am interested in the 50 states - without these other areas. As a result, I need to filter them out. Using base R’s %in%
function I would have to create a variable that contains the postal abbreviations for all 50 states. That is annoying. Instead, I want to use the shorter list that only includes the US’ associated islands and territories. To do so, however, I need to use the operator tools’ %!in%
function.
Line 2 creates the list of US territories that I filter out in line 7. The c()
function in R means combine or concatenate. Inside the parentheses are the five postal codes for American Samoa, Guam, the Northern Mariana Islands, Puerto Rico, and the Virgin Islands.
nps <- read_sf("path/to/file.shp")
loads the National Park data set to a variable called nps
using the read_sf()
function that is part of the sf package. You will need to change the file path so it reflects where you saved the data on your hard drive.
The %>%
operator is part of the tidyverse package. It passes the result of one step to the next function as its first argument, so the commands run in order. It has to go at the end of a line, rather than the beginning.
select
is part of the tidyverse package. With it, we can select columns by their name rather than their associated number. Large data sets take more computing power because the computer has to iterate over more rows. Unfortunately, rendering maps also takes a lot of computing power so I like to discard any unnecessary columns to reduce the amount of effort my computer has to exert.
Deciding on which columns to keep will depend on the data you’re using and what you want to map (or analyze). I know for my project I want to include a few things:
-There’s a couple ways to inspect the data to see what kind of information is available.
- -view(nps)
but as the number of data points increases, so does R’s struggle with opening it. I’ve found that VSCode doesn’t throw as big of a fit as R Studio when opening large data sets.
data.frame(colnames(nps))
. This will return a list of the data set’s column names. This is my preferred method. I then go to the documentation to see what each column contains. This isn’t fool-proof because it really depends on if the data has good documentation.
The National Park data includes a lot of information about who created the data and maintains the property. I’m not interested in this, so in line 6 I select the following columns:
-The geometry column is specific to shapefiles and it includes the coordinates of the shape. It will be kept automatically - unless you use the st_drop_geometry()
function. I like to select it explicitly so I remember it’s there.
In line 7 I use the territories list I created in line 2 to filter out the United States’ associated areas. Since the nps data uses the two character state abbreviation, I have to use the two character abbreviation for the territories. Searching for “Guam,” for example, won’t work.
- -filter()
is part of the tidyverse and it uses conditional language. In the parentheses is a condition that must be true if the tidyverse is going to keep the row. Starting at the top of the data, R goes “alright, does the value in the STATE column match any of the values in the territories list?” If the condition is TRUE, R adds the row to the new data frame.
With the %!in%
operator, any row that evaluates as TRUE will be kept because the value is NOT found in the territories list. If I wanted to keep only the territories, I would use the %in%
operator and only the rows with STATE abbreviations found in the territories list would be kept. For example, if the STATE value in row 1 is CA, filter looks at it and goes “is CA NOT IN territories?” If that is TRUE, keep it because we want only the values that are NOT IN the territories list.
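If you would rather not load operator.tools for this one operator, a not-in operator can be sketched in base R with Negate(). This is a minimal sketch and not how operator.tools implements its version; the operator name %not_in% and the state_codes vector are made up for illustration.

```r
# hand-rolled not-in operator built from base R's %in%
`%not_in%` <- Negate(`%in%`)

territories <- c("AS", "GU", "MP", "PR", "VI")
state_codes <- c("CA", "PR", "NV", "GU")  # hypothetical STATE values

# TRUE where the code is NOT found in the territories list
state_codes %not_in% territories                # TRUE FALSE TRUE FALSE

# keep only the rows that evaluate to TRUE
state_codes[state_codes %not_in% territories]   # "CA" "NV"
```

The logical vector in the middle is the same kind of vector filter() receives: one TRUE or FALSE per row, with only the TRUE rows kept.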
- -mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
The NPS data set has 23 different types of National Parks listed (you can view all of them by running levels(as.factor(nps$UNIT_TYPE))
). I know that in later posts, I’m going to color code the land by type (blue for rivers, green for national parks, etc) so I wanted to reduce the number of colors I would have to use.
mutate()
’s first argument, type =
creates a new column called type
. R will populate the newly created column with whatever comes after the first (singular) equal =
sign. For example, I can put type = NA
and every row in the column will say NA
.
Here, I am using the case_when()
function, which is also part of the tidyverse. The logic of case_when
is fairly straightforward. The first value is the name of the column you want R to look in (here: UNIT_TYPE
). Next, is a conditional. Here I am looking for an exact match (==
) to the string (words) inside the first set of quotation marks (in line 8: "International Historic Site"
). The last part of the argument is what I want R to put in the type
column when it finds a row where the UNIT_TYPE
is "International Historic Site"
.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
Lines 9-29 do the same thing for the other park types. You can reduce the parks however you want or use all 23 types. Just remember that the value before the tilde ~
has to match the values found in the data exactly. For example, in line 24 I change the NPS data’s National Trails Syste value to be National Trail. Whoever created the data set did not spell system correctly, so for R to match the value I also have to omit the last letter in system.
Lines 30-37 use the same mutate()
and case_when
logic as above. Instead of reducing the number of park types, I use it to mark the different parks I have visited.
Line 30 creates the new column, visited
and uses case_when
to look for the names of the parks that I’ve been to. If I have visited them, it adds visited
to the column of the same name.
The last line, TRUE ~ "not visited"))
, acts as an else statement. For any park not listed above, it will put not visited
in the visited
column I created.
This feels like a very brute-force method of tracking which parks I’ve visited, but I haven’t spent much time trying to find another way.
- -In part I, when I made the base map, I moved Alaska and Hawaii so they were of similar size and closer to the continental USA. For the map to display the parks correctly, I have to shift them as well.
- -I went over these two lines in part II, so I won’t go over them again here. If you want to read more about them, check out that post.
- -The last line uses the st_transform()
function from the sf package to convert the data set from NAD83 to WGS84. Leaflet requires WGS84, so be sure to include this line at the end of your data manipulation.
I covered the WGS84 ellipsoid in part I, if you want to read more about it.
- -Strictly speaking, this line isn’t necessary. You can do all your data processing in the same file where you make your map, but I prefer to separate the steps into different files.
- -As a result, I save the shifted data to my hard drive so it’s easier to load later. I usually have this line commented out (by placing #
at the start of the line) after I save it the first time. I don’t want it to save every time I run the rest of the code.
1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-
## create usa Base Map using leaflet()
- map <- leaflet() %>%
- addPolygons(data = states,
- smoothFactor = 0.2,
- fillColor = "#808080",
- fillOpacity = 0.5,
- stroke = TRUE,
- weight = 0.5,
- opacity = 0.5,
- color = "#808080",
- highlight = highlightOptions(
- weight = 0.5,
- color = "#000000",
- fillOpacity = 0.7,
- bringToFront = FALSE),
- group = "Base Map") %>%
- addPolygons(data = nps,
- smoothFactor = 0.2,
- fillColor = "#354f52",
- fillOpacity = 1,
- stroke = TRUE,
- weight = 1,
- opacity = 0.5,
- color = "#354f52",
- highlight = highlightOptions(
- weight = 3,
- color = "#fff",
- fillOpacity = 0.8,
- bringToFront = TRUE),
- group = "National Parks") %>%
- addLayersControl(
- baseGroups = "Base Map",
- overlayGroups = "National Parks",
- options = layersControlOptions(collapsed = FALSE))
-
Lines 2-16 are identical to those in part II where I created the base map. I am not going to cover them in detail because I covered them previously.
- -To add the National Park data to the base map, we call addPolygons()
again. The arguments are the same as before - color, opacity, outline style - just with different values. By changing those values, we can differentiate the base map from the national park data.
Since we’re mapping the National Parks and not the states, we have to tell R where the data is located using data = nps
.
smoothFactor()
determines how detailed the park boundaries should be. The lower the number, the more detailed the shape. The higher the number, the smoother the parks will render. I usually match this to whatever I set for the base map for consistency.
Define the color and transparency of the National Parks. In a future post, I am going to change the color of each type of public land, but for now, I’ll make them all a nice sage green color #354f52
. I also want the parks to be fully opaque.
The next four lines (21-24) define what kind of outline the National Parks will have. I detail each of these arguments in part II of this series.
- -Briefly, I want there to be an outline to each park (stroke = TRUE
) that’s thicker weight = 1
than the outline used on the base map. I do not like the way it looks at full opacity, so I make it half-transparent (opacity = 0.5
). Finally, I want the outline color = "#354f52"
to be the same color as the fill. This will matter more when I change the fill color of the parks later on.
Lines 25-28 define the National Park’s behavior on mouseover. First we have to define and initialize the highlightOptions()
function. The function takes similar arguments to the addPolygons
function - both of which I go over in detail in part II.
I want to keep the mouseover behavior noticeable, but simple. To do so, I set the outline’s thickness to be weight = 3
. This will give the shape a nice border that differentiates it from the rest of the map.
color = "#fff"
sets the outline’s color on mouseover only. So, when inactive, the outline color will match the fill color, but on mouseover the outline color switches to white (#fff
).
bringToFront
can either be TRUE
or FALSE
. If TRUE
, Leaflet will bring the park to the forefront on mouseover. This is useful later when we add in the state parks because national and state parks tend to be close together.
When FALSE
the shape will remain static.
Since Leaflet adds all new data to the top of the base map, I think it’s useful to group the layers together. In the next block of code, we add in some layer functionality. For now, though, I want to add the National Parks to their own group so I can hide the National Parks if I want.
- -addLayersControl
defines how layers are displayed on the final map. The function takes three arguments.
First, we have to tell Leaflet which layer should be used as the base map: baseGroups = "Base Map"
. The name in the quotations (here: "Base Map"
) has to match the name given to the layer you set in the addPolygons()
call. In line 16, I put the 50 states into a group called "Base Map"
, but you can name it anything you like.
There can be more than one base map, too. It’s not super helpful here since I shifted Alaska and Hawaii, but when using map tiles you can add multiple types of base maps that users can switch between.
- -Next, we have to define the layers that are shown on top of the base group: overlayGroups = "National Parks"
. Just like the base map, this is defined in the corresponding addPolygons
call. Here, I called the layer National Parks
in line 30.
Finally, on the map I don’t want the layers to be collapsed, so I set options = layersControlOptions(collapsed = FALSE)
. When TRUE
the map will display an icon in the top right that, when clicked, will show the available layers.
Hey, look at that! You made a base map and you added some National Park data to it. You’re a certified cartographer now!
- -In the next post, part IV, we’ll download and process the state park data before adding it to the map. In part V of this series we’ll add Shiny functionality and some additional markers.
- - -</figure>
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/a3/241ac7d82a37476b5b317d45db535d57d52603968fc785ff6de752973ed5fe b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/a3/241ac7d82a37476b5b317d45db535d57d52603968fc785ff6de752973ed5fe deleted file mode 100644 index 2c8ec14..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/a3/241ac7d82a37476b5b317d45db535d57d52603968fc785ff6de752973ed5fe +++ /dev/null @@ -1,251 +0,0 @@ -I"ůCWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered them in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation. Having visited the park, I can tell you there are no fences blocking these areas off. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area. It will be a good test case to make sure I’m selecting the correct data.
- -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/a4/fbfa5eaec02d5b5448b303c888653bd58dad21516f549d9773ff86ab66a570 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/a4/fbfa5eaec02d5b5448b303c888653bd58dad21516f549d9773ff86ab66a570 deleted file mode 100644 index c006fe3..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/a4/fbfa5eaec02d5b5448b303c888653bd58dad21516f549d9773ff86ab66a570 +++ /dev/null @@ -1,646 +0,0 @@ -I"‹ńWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered them in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access" &
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(). Both functions can read either format; read_sf() is a quieter wrapper around st_read().
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
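Since loading a whole layer can overwhelm R, here is a hedged sketch of one alternative: st_read() also accepts a query argument that filters rows while the file is being read. The tiny geopackage, the toy_fee layer name, and its two rows are made up so the example runs without the PAD-US download; I have not tested this against the real PADUS3_0Fee layer.

```r
library(sf)

# Build a tiny stand-in geopackage (hypothetical data, not PAD-US).
toy <- st_sf(
  Unit_Nm  = c("Park A", "Forest B"),
  Own_Type = c("STAT", "FED"),
  geometry = st_sfc(st_point(c(-114.5, 36.4)),
                    st_point(c(-120.0, 40.0)),
                    crs = 4326)
)
gpkg <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg, layer = "toy_fee", quiet = TRUE)

# List the layer names without loading any features.
layer_names <- st_layers(gpkg)$name

# Read only the state-owned rows; the SQL runs while the file is read,
# so the rest of the layer never reaches R's memory.
state_only <- st_read(
  gpkg,
  query = "SELECT * FROM toy_fee WHERE Own_Type = 'STAT'",
  quiet = TRUE
)
```

With the real file, the layer name in the FROM clause would be one of the names returned by st_layers().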
R and the Tidyverse’s filter()
function lets you combine conditions within one filter call by using “or” (|)
or “and” (&)
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
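I can’t tell from the code why the combined call failed, but one common cause is operator precedence: R evaluates & before |, so mixing them without parentheses changes which rows survive. The sketch below uses a made-up four-row table (not PAD-US data) to show the problem and one way around it with %in%.

```r
library(dplyr)

# Hypothetical rows standing in for the PAD-US Fee layer.
parks <- tibble(
  Own_Type = c("STAT", "STAT", "FED", "FED"),
  Des_Tp   = c("SP", "MIL", "SP", "REC")
)

# Without parentheses, & is evaluated before |,
# so the federal REC row is kept as well.
wrong <- parks %>%
  filter(Own_Type == "STAT" & Des_Tp == "SP" | Des_Tp == "REC")

# %in% tests Des_Tp against the whole list in a single condition,
# so there is no | left to interfere with the &.
keep_types <- c("SP", "REC")
right <- parks %>%
  filter(Own_Type == "STAT" & Des_Tp %in% keep_types)
```

Wrapping the chain of | conditions in parentheses would work too; %in% is just shorter when the list of designations is long.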
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for Recreation Management Area
s; for the rest, I just populated the new column with the old values.
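As a small illustration of that general form, here is a sketch with three made-up rows. The TRUE ~ d_Des_Tp fallback is my addition for the example: it copies the old value across whenever no earlier condition matches, where a case_when() without a fallback would leave NA.

```r
library(dplyr)

# Hypothetical designation values for illustration.
designations <- tibble(
  d_Des_Tp = c("State Park", "Recreation Management Area", "State Wilderness")
)

relabelled <- designations %>%
  mutate(type = case_when(
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    TRUE ~ d_Des_Tp  # fallback: keep the original value
  ))
```

Conditions are checked top to bottom and the first match wins, so the fallback has to come last.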
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data for Alaska and Hawaii so it appears under the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code, it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
-
split_states <- split(state_parks, f = state_parks$State_Nm) # split the data by state
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
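To see what the split-and-loop pattern does without touching the real shapefiles, here is a sketch on a three-row data frame. The rows, the temporary output folder, and the use of write.csv() in place of st_write() are all stand-ins for illustration.

```r
# Stand-in for state_parks: a plain data frame with a State_Nm column.
parks <- data.frame(
  Unit_Nm  = c("Valley of Fire State Park", "Salton Sea",
               "Anza-Borrego Desert State Park"),
  State_Nm = c("NV", "CA", "CA")
)

# split() returns a named list with one data frame per state.
split_states <- split(parks, f = parks$State_Nm)
out_dir <- tempdir()

# The loop visits each name once and writes that state's rows to its own file.
for (name in names(split_states)) {
  out_file <- file.path(out_dir, paste0(name, ".csv"))
  write.csv(split_states[[name]], out_file, row.names = FALSE)
}
```

The names of the list come from the values in State_Nm, which is why they can double as file names.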
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me a line if it is.
-** print("Hello World!")
Twitter is a great resource for engaging with the academic community. For example, I saw this Tweet by PhD Genie asking users to name one positive skill learned during their PhD. I love this question for a number of reasons. First, it helps PhDs reframe their experience so it’s applicable outside of academia - which can help when applying to jobs. Second, it’s really cool to see what skills other people have learned during their program.
-
I responded to the tweet because during my PhD I learned how to create maps in R. I started by recreating a map from the University of North Carolina’s Hussman School of Journalism’s News Deserts project (below). Now, I am working on a personal project mapping the U.S. National and State parks.
- - - -There was quite a bit of interest in how to do this, so in this series of posts I will document my process from start to finish.
- -First, I’m not an expert. I wanted to make a map, so I learned how. There may be easier ways and, if I learn how to do them, I’ll write another post.
- -Second, before starting, I strongly suggest setting up a Github and DVC. I wrote about how to use GitHub, the Github Website, and Github Desktop. You can use any of these methods to manage your repositories. I use all three based purely on whatever mood I’m in.
- -If you do use Git or GitHub, then DVC (data version control) is mandatory. GitHub will warn you that your file is too large if it’s over 50MB and reject your pushes if the files are over 100MB. The total repository size can’t exceed 2GB if you’re using the free version (which I am). DVC is useful because cartography files are large. They contain a lot of coordinates which increases with each location you try to map. DVC will store your data outside of GitHub but allows you to track changes with your data. It’s super useful.
- -Third, there are several ways to make a map. R is capable of making interactive maps and static maps. Static maps are less computationally expensive and better for publication. Interactive maps are prettier and better for displaying on the web.
- -I make interactive maps with Leaflet and Shiny because they offer a lot of functionality. The most common way is to use map tiles. Map tiles use data from sources like Open Street Map and Maps to create map squares (tiles) with custom data on top. A list of available map tiles is available on the Open Street Maps website.
- - - -When I make static maps (like the US map pictured above), I use ggplot
- -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-You only need to install the packages once. You can do so by running each line in the terminal. When you rerun the code later, you can skip right to loading the packages using library("package-name")
-
## you only need to install the packages once
-
- install.packages("leaflet") # interactive maps
- install.packages("shiny") # added map functionality
- install.packages("tidyverse") # data manipulation
- install.packages("tigris") # cartographic boundaries
- install.packages("operator.tools") # for the not-in function
- install.packages("sf") # read and write shapefiles
-
leaflet()
)addTiles()
, addPolygons()
, or addMarkers()
.
- The Tidyverse is a collection of packages used for data manipulation and analysis. Its syntax is more intuitive than base R. Furthermore, you can chain (aka pipe) commands together.
- -For cartography, you don’t need the whole Tidyverse. We’ll mainly use dplyr
and ggplot2
. You can install these packages individually instead of installing the whole tidyverse. Though, when we get to the national park database, we’ll also need purrr
and tidyr
.
operator.tools is not required, but it’s recommended.
- -For some unknown reason, base R has a %in%
function but not a not-in
function. Unfortunately, the United States is still an empire with its associated areas, islands, and pseudo-states. I only want to include the 50 states, so I needed a way to easily filter out the non-states. operator.tools’ %!in%
function is perfect for that.
To start, create and save a new file called usa.r
. In it, we’re going to download and modify the United States shape data that we’ll use to create the base map in part two of this series.
At the beginning of each file, you have to load the necessary packages. In this file, the only packages we need to load are tidyverse, sf, and tigris. I also load leaflet to make sure the map renders correctly.
-
## load libraries
- library("tidyverse")
- library("sf")
- library("tigris")
- library("leaflet")
-
There are two ways to download the USA shape data. First, we can use the R package, tigris. Second, we can download it from the Census website.
- -I prefer using tigris but I’ve been having some problems with it. Sometimes it ignores the Great Lakes and merges Michigan and Wisconsin into a Frankenstate (boxed in red below).
- - - -tigris()
downloads TIGER/Line shapefile data directly from the Census Bureau and includes a treasure trove of data. Some of the data includes land area, water area, state names, and geometry.
Tigris can also download boundaries for counties, divisions, regions, tracts, blocks, congressional and school districts, and a whole host of other groupings. A complete list of available data can be found on the package’s GitHub.
-
## download state data using tigris()
- us_states <- tigris::states(cb = TRUE, year = 2020) %>%
- filter(STATEFP < 57) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted shapefile
- st_write(us_states, "path/to/file/usa.shp")
-
Here we create the us_states
variable, save the geographic data to it, move Alaska and Hawaii so they’re beneath the continental US, and save the shifted shapefile.
R uses the <-
operator to define new variables. Here, we’re naming our new variable us_states
.
In our us_states
variable we’re going to store data on the 50 states downloaded using tigris
. Within (::
) tigris, we’re going to use the states()
function.
The states()
function allows you to pull state-level data from the Census. This function takes several arguments
The cb
argument can either be TRUE
or FALSE
. Setting cb = FALSE
tells tigris to download the most detailed shapefile. If cb = TRUE
it will download a generalized (1:500k) file. After a lot of trial and error, I found that using cb = TRUE
prevents the Frankenstate from happening.
If the year
argument is omitted it will download the shapefile for the default year (currently 2020). I set it out of habit from working with county boundaries, which change more often than state boundaries.
Finally, the %>%
operator is part of the Tidyverse. It basically tells R “Hey! I’m not done, keep going to the next line!”
tigris::states()
downloads data for the 50 states and the United States’ minor outlying islands, Puerto Rico, and its associated territories. Each state and territory is assigned a unique two-digit Federal Information Processing Standard [FIPS] code.
They’re mostly consecutive (Alabama is 01), but when they were conceived of in the 1970s a couple were reserved for the US territories (American Samoa was 03). In the updated version the “reserved codes” were left out and the territories were assigned new numbers (American Samoa is now 60). The important bit is that the last state in the list (Wyoming) has a FIPS code of 56.
- -This line of code uses the filter()
function on the STATEFP
variable downloaded using tigris. All it says is keep any row that has a FIPS code of less than 57. This keeps the 50 states (plus the District of Columbia, whose code is 11) and excludes the United States’ empire associated territories.
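The FIPS cutoff can be tried on a toy table before touching the real data. This is a minimal sketch in base R; the six rows are hand-typed for illustration and are not pulled from tigris. I convert STATEFP to an integer first so the comparison is numeric rather than alphabetical.

```r
# Toy table of FIPS codes; STATEFP is zero-padded text, as in the Census files
fips <- data.frame(
  STATEFP = c("01", "02", "11", "56", "60", "72"),
  NAME = c("Alabama", "Alaska", "District of Columbia",
           "Wyoming", "American Samoa", "Puerto Rico"),
  stringsAsFactors = FALSE
)

# Keep every row whose code is below 57 (numeric comparison)
kept <- fips[as.integer(fips$STATEFP) < 57, ]
print(kept$NAME)
```

The District of Columbia survives the cutoff because its code is 11, so drop it separately if you want exactly 50 rows.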
The shift_geometry()
function is from the tigris package. Here it takes two arguments, preserve_area
and position
.
When preserve_area = FALSE
tigris will shrink Alaska’s size and increase Hawaii’s so that they are comparable to the size of the other states.
The position
argument can either be "below"
or "outside"
. When it’s below
, both Alaska and Hawaii are moved to be below California. When it’s outside
then Alaska is moved to be near Washington and Hawaii is moved to be near California.
Since I’m a born-theorist, I should warn you that messing with maps has inherent normative implications. The most common projection is Mercator which stretches the continents near the poles and squishes the ones near the equator.
- - - -One of the competing projections is Gall-Peters which claims to be more accurate because it was - at the time it was created in the 1980s - the only “area-correct map,” though it has since been criticized for skewing both the polar continents and the equatorial ones. The above photo shows you just how different the projections are from one another.
- -The problem arises because we’re trying to project a 3D object into 2D space. It’s a classic case of even though we can, maybe we shouldn’t. Computers can do these computations and change the projections to anything we want fairly easily. However, humans think and exist in metaphors. We assume bigger = better and up = good. When we use a projection that puts the Northern Hemisphere both on top of and larger than other parts of the world, we are imbuing that projection with metaphorical meaning.
- -I caution you to be careful when creating maps. Think through the implications of something as simple as making Alaska more visually appealing by distorting it to be of similar size as the other states.
- -If you want to read more about map projections this is a good post. If you want to read more about metaphors, I suggest Metaphors We Live By by George Lakoff and Mark Johnson.
- -The sf
package includes a function called st_transform()
which will reproject the data for us. There are a lot of projections. You can read about them at the proj website.
Leaflet requires all boundaries to use the World Geodetic System 1984 (WGS84) coordinate system. While making maps I’ve come across two main coordinate systems: WGS84 and the North American Datum of 1983 (NAD83). WGS84 uses the WGS84 ellipsoid and NAD83 uses the Geodetic Reference System 1980 (GRS80) ellipsoid. From what I’ve gathered, the differences are slight, but Leaflet requires WGS84 and the Census uses NAD83. As a result, we have to reproject the data in order to make our map.
- -The st_transform
function takes a single PROJ string made up of four parameters, each preceded by a +
. I include all four to transform the data from NAD83 to WGS84.
Briefly, +proj=longlat
tells R to project the data into longitude and latitude [rather than, for example, transverse mercator (tmerc
)].
+ellps=WGS84
sets the ellipsoid to the WGS84 standard.
+datum=WGS84
is a holdover from previous proj releases. It tells R to use the WGS84 datum.
+no_defs
is also a holdover.
Essentially, you need to include line 6 before you create the map, but after you do any data manipulation. It might throw some warnings which you can just ignore.
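To see the reprojection on its own, here is a small sketch that moves a single point from NAD83 to WGS84. It assumes the sf package is installed; the point is a rough, hand-typed coordinate in southern Nevada, not real park data. I use the EPSG codes 4269 (NAD83) and 4326 (WGS84) here, which I understand to be the usual shorthand for these two systems, instead of the longer PROJ string.

```r
library(sf)

# One illustrative point stored in NAD83 (EPSG:4269), like the Census files
pt_nad83 <- st_sfc(st_point(c(-114.51, 36.48)), crs = 4269)

# Reproject to WGS84 (EPSG:4326), the system Leaflet expects
pt_wgs84 <- st_transform(pt_nad83, crs = 4326)

print(st_crs(pt_wgs84)$epsg)      # EPSG code of the reprojected point
print(st_is_longlat(pt_wgs84))    # TRUE when coordinates are longitude/latitude
```

Checking st_crs() after the transform is a quick way to confirm the data is in the system you expect before handing it to Leaflet.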
- -In the last line, we save the data we manipulated in lines 2-6. Strictly speaking you don’t have to save the shapefile. You can manipulate the data and then skip right to mapping the data. I caution against it because the files can get unreadable once you start using multiple data sets. I usually comment out line 9 after I save the file. That way I’m not saving and re-saving it whenever I need to run the code above it.
- -The st_write()
function is part of the sf
package and it takes two arguments. The first is the data set you want to save. Since I used us_states
to save the data, it will be the first argument in the st_write()
function call.
The second argument is the path to where you want the file saved and what name you want to give it. I named mine usa
. It is mandatory that you add .shp
to the end of the filepath so that R knows to save it as a shapefile.
Although it’s called a shapefile, it’s actually four files. I usually create a separate folder for each set of shapefiles and store that in one master folder called shapefiles. An example of my folder structure is below. I keep all of this in my GitHub repo and track changes using DVC.
- - - -On my C://
drive is My Documents
. In that folder I keep a GitHub
folder that holds all my repos, including my nps
one. Inside the nps
folder I separate my shapefiles into their own folder. For this tutorial I am using original and shifted shapefiles, so I’ve also separated them into two separate folders to keep things neat. I also know I’m going to have multiple shapefiles (one for the USA, one for the National Parks, and a final one for the State Parks) so I created a folder for each set. In the usa
folder I saved the shifted states shapefile.
Altogether, my line 9 would read:
- - - -Running that line will save the four necessary files that R needs to load the geographic data.
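If you want to see the four files for yourself, the sketch below writes a one-point toy shapefile into a temporary folder and lists what was created. It assumes the sf package is installed; the toy object and the folder name are made up for illustration.

```r
library(sf)

# A one-row toy data set standing in for the shifted states
toy <- st_sf(
  name = "toy",
  geometry = st_sfc(st_point(c(-114.51, 36.48)), crs = 4326)
)

# Write into a fresh temporary folder so nothing real is overwritten
out_dir <- tempfile("usa_demo_")
dir.create(out_dir)
st_write(toy, file.path(out_dir, "usa.shp"), quiet = TRUE)

# List the extensions of the files that make up the "shapefile"
written <- sort(tools::file_ext(list.files(out_dir)))
print(written)
```

Re-running st_write() against a file that already exists will fail unless you pass something like append = FALSE or delete_dsn = TRUE, which is another reason to comment the line out once the file is saved.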
- -That’s it for method 1 using tigris
. The next section, method 2, shows how to load and transform a previously downloaded shapefile. If you used method 1, feel free to leave this post and go directly to mapping the shapefile in part II of this series.
In this section, I’ll go through the process of downloading the shapefiles from the Census website. If you tried method 1 and tigris caused the weird Frankenstate, you can try using the data downloaded from the Census website. I don’t know why it works, since tigris uses the same data, but it does.
- -Generally, though, finding and using shapefiles created by others is a great way to create cool maps. There are thousands of shapefiles available, many from ArcGis’ Open Data Website.
- -Save the file wherever you want, but I prefer to keep it within the “original” shapefiles folder in a sub-folder called “zips.” Once it downloads, unzip it - again, anywhere is fine. It will download all 30 Census shapefiles. We’re only going to use the one called “cb_2021_us_state_500k.zip”. The rest you can delete, if you want.
- - - -When you unzip the cb_2021_us_state_500k.zip, it will contain four files. You’ll only ever work with the .shp
file, but the other three are used in the background to display the data.
Once all the files are unzipped, we can load the .shp
file into R.
1
-2
-3
-4
-5
-6
-7
-8
-9
-
## load a previously downloaded shapefile
- usa <- read_sf("shapefiles/original/usa/states/cb_2021_us_state_500k.shp") %>%
- filter(STATEFP < 57) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted shapefile
- st_write(usa, "path/to/file/usa.shp")
-
Everything except line 2 is the same as in method 1. I won’t go over lines 3-9 here, because all the information is above.
- -This line is very similar to the one above. I changed the name of the variable to usa
so I could keep both methods in the same R file (each R variable needs to be unique or it will be overwritten).
read_sf
is part of the sf package. It’s used to load shapefiles into R. The path to the file is enclosed in quotation marks and parentheses. Simply navigate to wherever you unzipped the cb_2021_us_state_500k file and choose the file with the .shp
extension.
Once the shapefiles are downloaded - either using tigris() or by loading the shapefiles from the Census website - you can create the base map. I’ll tackle making the base map in part II of this series.*
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/a8/0c4a320e4314e67d878a4cbf17de723fa15b8025c7213e4a7c6d4b44a8a37c b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/a8/0c4a320e4314e67d878a4cbf17de723fa15b8025c7213e4a7c6d4b44a8a37c deleted file mode 100644 index 82cce46..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/a8/0c4a320e4314e67d878a4cbf17de723fa15b8025c7213e4a7c6d4b44a8a37c +++ /dev/null @@ -1,686 +0,0 @@ -I"ĆţWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load only the geopackage’s layer names. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
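To get a feel for what st_layers() returns without the multi-gigabyte download, you can build a tiny geopackage with two layers and inspect it. This is only a sketch: it assumes the sf package is installed, and the layer names toy_fee and toy_easement are invented stand-ins for the real PAD-US layers.

```r
library(sf)

# One made-up feature to store in each layer
toy <- st_sf(
  Unit_Nm = "Toy Park",
  geometry = st_sfc(st_point(c(-114.51, 36.48)), crs = 4326)
)

# Write the same feature into two differently named layers of one geopackage
gpkg <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg, layer = "toy_fee", quiet = TRUE)
st_write(toy, gpkg, layer = "toy_easement", quiet = TRUE)

# Read only the layer listing, not the features themselves
layers <- st_layers(gpkg)
print(layers$name)
```

The name element is what you would later pass to the layer argument when reading a single layer.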
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Fields which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the layer PADUS3_0Fee layer because I know it contains the shape data from the state parks.
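Because loading a whole layer can overwhelm R, it may be worth knowing that st_read() also accepts a query argument, which asks the file for only the matching rows. The sketch below demonstrates the idea on a toy geopackage; it assumes the sf package is installed, and the layer name toy_fee and both rows are invented. I have not tested this against the full PAD-US file, so treat it as a possible approach rather than a guaranteed fix.

```r
library(sf)

# Two made-up features with different owner types
toy <- st_sf(
  Unit_Nm = c("Toy State Park", "Toy Federal Site"),
  Own_Type = c("STAT", "FED"),
  geometry = st_sfc(st_point(c(-114.51, 36.48)),
                    st_point(c(-115.90, 33.30)), crs = 4326)
)
gpkg <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg, layer = "toy_fee", quiet = TRUE)

# Ask for only the state-owned rows instead of reading the whole layer
state_only <- st_read(
  gpkg,
  query = "SELECT * FROM toy_fee WHERE Own_Type = 'STAT'",
  quiet = TRUE
)
print(state_only$Unit_Nm)
```

With the real data, the table name in the query would be the layer name and the WHERE clause would use the columns found in the PAD-US Viewer.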
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After these the two conditions in this line the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
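One way the first two filter() calls might be folded into a single call is to list the wanted designation codes in a vector and test membership with %in%. This is a sketch on a four-row toy data frame, assuming dplyr is installed; the rows are invented, and I use !(x %in% y) in place of the not-in operator so the example needs only dplyr.

```r
library(dplyr)

# Invented rows that mimic the three columns used in the filters
parks <- data.frame(
  State_Nm = c("NV", "PR", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "STAT", "FED"),
  Des_Tp = c("SP", "SP", "SREC", "SP"),
  stringsAsFactors = FALSE
)

territories <- c("AS", "GU", "MP", "PR", "VI")
keep_types <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Conditions separated by commas are combined with "and"
filtered <- parks %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% keep_types)
print(filtered$State_Nm)
```

Keeping the codes in a named vector also makes it easier to add or remove a designation later without rewriting the chain of | comparisons.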
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for Recreation Management Area
s; for the rest I just populated the new column with the old values.
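Since several designations map to the same label, the case_when() call could be shortened by grouping them with %in%. The following is a sketch on a toy column, assuming dplyr is installed; the four rows are invented, and the labels are copied from the code above.

```r
library(dplyr)

# Invented rows covering each output label once
parks <- data.frame(
  d_Des_Tp = c("State Park", "State Wilderness",
               "Historic or Cultural Area", "Access Area"),
  stringsAsFactors = FALSE
)

historic_types <- c("Historic or Cultural Area", "State Historic or Cultural Area")
reserve_types <- c("Recreation Management Area", "State Resource Management Area",
                   "State Wilderness", "State Recreation Area",
                   "State Conservation Area")

# Each condition is checked in order; the first match supplies the label
labelled <- parks %>%
  mutate(type = case_when(
    d_Des_Tp == "Access Area" ~ "State Trail",
    d_Des_Tp %in% historic_types ~ "State Historical Park, Site, Monument, or Memorial",
    d_Des_Tp %in% reserve_types ~ "State Preserve, Reserve, or Recreation Area",
    d_Des_Tp == "State Park" ~ "State Park or Parkway"
  ))
print(labelled$type)
```

Any designation that matches none of the conditions ends up as NA, which is a handy way to spot a type you forgot to map.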
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the mean time I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
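One possible way to keep track of visited parks without a long case_when() is to store the names in a vector and test membership. This is a sketch, assuming dplyr is installed; "Some Other State Park" is a made-up name used only to show the non-matching case.

```r
library(dplyr)

# The list of parks I've been to lives in one place and is easy to extend
visited_parks <- c("Valley of Fire State Park", "Salton Sea")

# Invented rows; the middle name is hypothetical
parks <- data.frame(
  Unit_Nm = c("Valley of Fire State Park", "Some Other State Park", "Salton Sea"),
  stringsAsFactors = FALSE
)

# Mark each row by checking its name against the vector
parks <- parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))
print(parks$visited)
```

The vector could just as easily be read from a small text or CSV file, so adding a park would not require touching the processing code.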
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
- -1
-2
-3
-4
-5
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -The split()
is part of base R. It takes quite a few arguments, most of which are optional.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks
data into its corresponding states, so it is listed first.
The second argument f =
is how you want the data split. f
in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm))
in the terminal, it will return a list of the 50 state abbreviations. That is what we’re telling R to do here.
You can access an individual state using the $
operator. split_states$CA
will return the state park data for California.
names
is also part of base R. It does what it sounds like - it gets the names of an object. Here, I want to get the name of each of the split data sets.
Here’s the actual for loop.
- -The for(x in y) {
- do something}
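The split-and-loop pattern can be tried on a toy data frame before running it on the park data. This sketch uses only base R and writes CSV files instead of shapefiles so it runs without sf; the park names and the temporary folder are made up for illustration.

```r
# Invented rows: two parks in CA and one in NV
parks <- data.frame(
  Unit_Nm = c("Park A", "Park B", "Park C"),
  State_Nm = c("CA", "NV", "CA"),
  stringsAsFactors = FALSE
)

# One list element per state, named after the state abbreviation
split_states <- split(parks, f = parks$State_Nm)

# Write each state's rows to its own file in a temporary folder
out_dir <- tempfile("states_")
dir.create(out_dir)
for (name in names(split_states)) {
  write.csv(split_states[[name]],
            file.path(out_dir, paste0(name, ".csv")),
            row.names = FALSE)
}
print(list.files(out_dir))
```

On each pass through the loop, name holds one state abbreviation, which is used both to pull that state's rows out of the list and to build the file name.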
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me an email or a tweet if it is.
-** print("Hello World!")
Twitter is a great resource for engaging with the academic community. For example, I saw this Tweet by PhD Genie asking users to name one positive skill learned during their PhD. I love this question for a number of reasons. First, it helps PhDs reframe their experience so it’s applicable outside of academia - which can help when applying to jobs. Second, it’s really cool to see what skills other people have learned during their program.
-
I responded to the tweet because during my PhD I learned how to create maps in R. I started by recreating a map from the University of North Carolina’s Hussman School of Journalism’s News Deserts project (below). Now, I am working on a personal project mapping the U.S. National and State parks.
- - - -There was quite a bit of interest in how to do this, so in this series of posts I will document my process from start to finish.
- -First, I’m not an expert. I wanted to make a map, so I learned how. There may be easier ways and, if I learn how to do them, I’ll write another post.
- -Second, before starting, I strongly suggest setting up GitHub and DVC. I wrote about how to use Git, the GitHub website, and GitHub Desktop. You can use any of these methods to manage your repositories. I use all three based purely on whatever mood I’m in.
- -If you do use Git or GitHub, then DVC (data version control) is mandatory. GitHub will warn you that your file is too large if it’s over 50MB and reject your pushes if the files are over 100MB. The total repository size can’t exceed 2GB if you’re using the free version (which I am). DVC is useful because cartography files are large. They contain a lot of coordinates which increases with each location you try to map. DVC will store your data outside of GitHub but allows you to track changes with your data. It’s super useful.
- -Third, there are several ways to make a map. R is capable of making interactive maps and static maps. Static maps are less computationally expensive and better for publication. Interactive maps are prettier and better for displaying on the web.
- -I make interactive maps with Leaflet and Shiny because they offer a lot of functionality. The most common way is to use map tiles. Map tiles use data from sources like Open Street Map and Maps to create map squares (tiles) with custom data on top. A list of available map tiles is available on the Open Street Maps website.
- - - -When I make static maps (like the US map pictured above), I use ggplot
- -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-You only need to install the packages once. You can do so by running each line in the terminal. When you rerun the code later, you can skip right to loading the packages using library("package-name")
1
-2
-3
-4
-5
-6
-7
-8
-
## you only need to install the packages once
-
- install.packages("leaflet") # interactive maps
- install.packages("shiny") # added map functionality
- install.packages("tidyverse") # data manipulation
- install.packages("tigris") # cartographic boundaries
- install.packages("operator.tools") # for the not-in function
- install.packages("sf") # read and write shapefiles
-
leaflet()
)addTiles()
, addPolygons()
, or addMarkers()
.
- The Tidyverse is a collection of packages used for data manipulation and analysis. Its syntax is more intuitive than base R. Furthermore, you can chain (aka pipe) commands together.
- -For cartography, you don’t need the whole Tidyverse. We’ll mainly use dplyr
and ggplot2
. You can install these packages individually instead of installing the whole tidyverse. Though, when we get to the national park database, we’ll also need purrr
and tidyr
.
operator.tools is not required, but it’s recommended.
- -For some unknown reason, base R has a %in%
function but not a not-in
function. Unfortunately, the United States is still an empire with its associated areas, islands, and pseudo-states. I only want to include the 50 states, so I needed a way to easily filter out the non-states. operator.tools’ %!in%
function is perfect for that.
To start, create and save a new file called usa.r
. In it, we’re going to download and modify the United States shape data that we’ll use to create the base map in part two of this series.
At the beginning of each file, you have to load the necessary packages. In this file, the only packages we need to load are tidyverse, sf, and tigris. I also load leaflet to make sure the map renders correctly.
- -1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse")
- library("sf")
- library("tigris")
- library("leaflet")
-
There are two ways to download the USA shape data. First, we can use the R package tigris. Second, we can download it from the Census website.
- -I prefer using tigris but I’ve been having some problems with it. Sometimes it ignores the Great Lakes and merges Michigan and Wisconsin into a Frankenstate (boxed in red below).
- - - -tigris()
downloads the TIGER/Line shapefile data directly from the Census and includes a treasure trove of data, such as land area, water area, state names, and geometry.
tigris can also download boundaries for counties, divisions, regions, tracts, blocks, congressional and school districts, and a whole host of other groupings. A complete list of available data can be found on the package’s GitHub.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-
## download state data using tigris()
- us_states <- tigris::states(cb = FALSE, year = 2020) %>%
- filter(STATEFP < 57) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted shapefile
- st_write(us_states, "path/to/file/usa.shp")
-
Here we create the us_states
variable, save the geographic data to it, move Alaska and Hawaii so they’re beneath the continental US, and save the shifted shapefile.
R uses the <-
operator to define new variables. Here, we’re naming our new variable us_states
.
In our us_states
variable we’re going to store data on the 50 states downloaded using tigris
. Within (::
) tigris, we’re going to use the states()
function.
The states()
function allows you to pull state-level data from the Census. This function takes several arguments.
The cb
argument can either be TRUE
or FALSE
. Setting cb = FALSE
tells tigris to download the most detailed shapefile. If cb = TRUE
, it will download a generalized (1:500k) file. After a lot of trial and error, I found that using cb = TRUE
prevents the Frankenstate from happening.
If the year
argument is omitted, it will download the shapefile for the default year (currently 2020). I set it out of habit from working with county boundaries, where I have to set the year because county boundaries change more often than state boundaries.
Finally, the %>%
operator is part of the Tidyverse. It basically tells R “Hey! I’m not done, keep going to the next line!”
tigris::states()
downloads data for the 50 states and the United States’ minor outlying islands, Puerto Rico, and its associated territories. Each state and territory is assigned a unique two-digit Federal Information Processing Standard [FIPS] code.
They’re mostly consecutive (Alabama is 01), but when the codes were conceived in the 1970s a few were reserved for the US territories (American Samoa was 03). In the updated version the “reserved codes” were left out and the territories were assigned new numbers (American Samoa is now 60). The important bit is that the last official state (Wyoming) has a FIPS code of 56.
- -This line of code uses the filter()
function on the STATEFP
variable downloaded using tigris. All it says is keep any row that has a FIPS code of less than 57. This keeps the 50 states (plus the District of Columbia, whose FIPS code is 11) and excludes the territories associated with the United States’ empire.
The shift_geometry()
function is from the tigris package. We use two of its arguments: preserve_area
and position
.
When preserve_area = FALSE
tigris will shrink Alaska’s size and increase Hawaii’s so that they are comparable to the size of the other states.
The position
argument can either be "below"
or "outside"
. When it’s below
, both Alaska and Hawaii are moved to be below California. When it’s outside
then Alaska is moved to be near Washington and Hawaii is moved to be near California.
Since I’m a born-theorist, I should warn you that messing with maps has inherent normative implications. The most common projection is Mercator which stretches the continents near the poles and squishes the ones near the equator.
- - - -One of the competing projections is Gall-Peters, which claims to be more accurate because it was - at the time it was created in the 1980s - the only “area-correct map,” though it has since been criticized for skewing both the polar continents and the equatorial ones. The above photo shows you just how different the projections are from one another.
- -The problem arises because we’re trying to project a 3D object into 2D space. It’s a classic case of even though we can, maybe we shouldn’t. Computers can do these computations and change the projections to anything we want fairly easily. However, humans think and exist in metaphors. We assume bigger = better and up = good. When we use projections that put the Northern Hemisphere both on top and larger than other parts of the world, we are imbuing that projection with metaphorical meaning.
- -I caution you to be careful when creating maps. Think through the implications of something as simple as making Alaska more visually appealing by distorting it to be of similar size as the other states.
- -If you want to read more about map projections this is a good post. If you want to read more about metaphors, I suggest Metaphors We Live By by George Lakoff and Mark Johnson.
- -The sf
package includes a function called st_transform()
which will reproject the data for us. There are a lot of projections. You can read about them on the PROJ website.
Leaflet requires all boundaries to use the World Geodetic System 1984 (WGS84) coordinate system. While making maps I’ve come across two main coordinate systems: WGS84 and North American Datum 1983 (NAD83). WGS84 uses the WGS84 ellipsoid and NAD83 uses the Geodetic Reference System 1980 (GRS80) ellipsoid. From what I’ve gathered, the differences are slight, but Leaflet requires WGS84 and the Census uses NAD83. As a result, we have to reproject the data in order to make our map.
- -The st_transform
call here takes a single PROJ string made up of four parameters, each preceded by a +
. Together, these parameters describe the WGS84 system that the NAD83 data is transformed into.
Briefly, +proj=longlat
tells R to project the data into longitude and latitude [rather than, for example, transverse mercator (tmerc
)].
+ellps=WGS84
sets the ellipsoid to the WGS84 standard.
+datum=WGS84
is a holdover from previous PROJ releases. It tells R to use the WGS84 datum.
+no_defs
is also a holdover.
Essentially, you need to include line 6 before you create the map, but after you do any data manipulation. It might throw some warnings which you can just ignore.
- -In the last line, we save the data we manipulated in lines 2-6. Strictly speaking you don’t have to save the shapefile. You can manipulate the data and then skip right to mapping the data. I caution against it because the files can get unreadable once you start using multiple data sets. I usually comment out line 9 after I save the file. That way I’m not saving and re-saving it whenever I need to run the code above it.
- -The st_write()
function is part of the sf
package and it takes two arguments. The first is the data set you want to save. Since I used us_states
to save the data, it will be the first argument in the st_write()
function call.
The second argument is the path to where you want the file saved and what name you want to give it. I named mine usa
. It is mandatory that you add .shp
to the end of the filepath so that R knows to save it as a shapefile.
Although it’s called a shapefile, it’s actually four files. I usually create a separate folder for each set of shapefiles and store that in one master folder called shapefiles. An example of my folder structure is below. I keep all of this in my GitHub repo and track changes using DVC.
- - - -On my C://
drive is My Documents
. In that folder I keep a GitHub
folder that holds all my repos, including my nps
one. Inside the nps
folder I separate my shapefiles into their own folder. For this tutorial I am using original and shifted shapefiles, so I’ve also separated them into two separate folders to keep things neat. I also know I’m going to have multiple shapefiles (one for the USA, one for the National Parks, and a final one for the State Parks) so I created a folder for each set. In the usa
folder I saved the shifted states shapefile.
Altogether, my line 9 would read:
- - - -Running that line will save the four necessary files that R needs to load the geographic data.
- -That’s it for method 1 using tigris
. The next section, method 2, shows how to load and transform a previously downloaded shapefile. If you used method 1, feel free to leave this post and go directly to mapping the shapefile in part II of this series.
In this section, I’ll go through the process of downloading the shapefiles from the Census website. If you tried method 1 and tigris caused the weird Frankenstate, you can try using the data downloaded from the Census website. I don’t know why it works, since tigris uses the same data, but it does.
- -Generally, though, finding and using shapefiles created by others is a great way to create cool maps. There are thousands of shapefiles available, many from ArcGis’ Open Data Website.
- -Save the file wherever you want, but I prefer to keep it within the “original” shapefiles folder in a sub-folder called “zips.” Once it downloads, unzip it - again, anywhere is fine. It will download all 30 Census shapefiles. We’re only going to use the one called “cb_2021_us_state_500k.zip”. The rest you can delete, if you want.
- - - -When you unzip the cb_2021_us_state_500k.zip, it will contain four files. You’ll only ever work with the .shp
file, but the other three are used in the background to display the data.
Once all the files are unzipped, we can load the .shp
file into R.
1
-2
-3
-4
-5
-6
-7
-8
-9
-
## load a previously downloaded shapefile
- usa <- read_sf("shapefiles/original/usa/states/cb_2021_us_state_500k.shp") %>%
- filter(STATEFP < 57) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted shapefile
- st_write(usa, "path/to/file/usa.shp")
-
Everything except line 2 is the same as in method 1. I won’t go over lines 3-9 here, because all the information is above.
- -This line is very similar to the one above. I changed the name of the variable to usa
so I could keep both methods in the same R file (each R variable needs to be unique or it will be overwritten).
read_sf
is part of the sf package. It’s used to load shapefiles into R. The path to the file is enclosed in quotation marks and parentheses. Simply navigate to wherever you unzipped the cb_2021_us_state_500k file and choose the file with the .shp
extension.
Once the shapefiles are downloaded - either using tigris() or by loading the shapefiles from the Census website - you can create the base map. I’ll tackle making the base map in part II of this series.*
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/aa/e02f2fed1aed3fdc5c5f1df529d07469c5652c3f47cdd1dc8d4cca4a44a8d6 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/aa/e02f2fed1aed3fdc5c5f1df529d07469c5652c3f47cdd1dc8d4cca4a44a8d6 new file mode 100644 index 0000000..9c80bbe --- /dev/null +++ b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/aa/e02f2fed1aed3fdc5c5f1df529d07469c5652c3f47cdd1dc8d4cca4a44a8d6 @@ -0,0 +1,30 @@ +I"[Twitter is a great resource for engaging with the academic community. For example, I saw this Tweet by PhD Genie asking users to name one positive skill learned during their PhD. I love this question for a number of reasons. First, it helps PhDs reframe their experience so it’s applicable outside of academia - which can help when applying to jobs. Second, it’s really cool to see what skills other people have learned during their program.
-
I responded to the tweet because during my PhD I learned how to create maps in R. I started by recreating a map from the University of North Carolina’s Hussman School of Journalism’s News Deserts project (below). Now, I am working on a personal project mapping the U.S. National and State parks.
- - - -There was quite a bit of interest in how to do this, so in this series of posts I will document my process from start to finish.
- -First, I’m not an expert. I wanted to make a map, so I learned how. There may be easier ways and, if I learn how to do them, I’ll write another post.
- -Second, before starting, I strongly suggest setting up a Github and DVC. I wrote about how to use GitHub, the Github Website, and Github Desktop. You can use any of these methods to manage your repositories. I use all three based purely on whatever mood I’m in.
- -If you do use Git or GitHub, then DVC (data version control) is mandatory. GitHub will warn you that your file is too large if it’s over 50MB and reject your pushes if the files are over 100MB. The total repository size can’t exceed 2GB if you’re using the free version (which I am). DVC is useful because cartography files are large. They contain a lot of coordinates which increases with each location you try to map. DVC will store your data outside of GitHub but allows you to track changes with your data. It’s super useful.
- -Third, there are several ways to make a map. R is capable of making interactive maps and static maps. Static maps are less computationally expensive and better for publication. Interactive maps are prettier and better for displaying on the web.
- -I make interactive maps with Leaflet and Shiny because they offer a lot of functionality. The most common way is to use map tiles. Map tiles use data from sources like Open Street Map and Maps to create map squares (tiles) with custom data on top. A list of available map tiles is available on the Open Street Maps website.
- - - -When I make static maps (like the US map pictured above), I use ggplot
- -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-You only need to install the packages once. You can do so by running each line in the R console. When you rerun the code later, you can skip right to loading the packages using library("package-name")
1
-2
-3
-4
-5
-6
-7
-8
-
## you only need to install the packages once
-
- install.packages("leaflet") # interactive maps
- install.packages("shiny") # added map functionality
- install.packages("tidyverse") # data manipulation
- install.packages("tigris") # cartographic boundaries
- install.packages("operator.tools") # for the not-in function
- install.packages("sf") # read and write shapefiles
-
leaflet()
)addTiles()
, addPolygons()
, or addMarkers()
.
- The Tidyverse is a collection of packages used for data manipulation and analysis. Its syntax is more intuitive than base R. Furthermore, you can chain (aka pipe) commands together.
- -For cartography, you don’t need the whole Tidyverse. We’ll mainly use dplyr
and ggplot2
. You can install these packages individually instead of installing the whole tidyverse. Though, when we get to the national park database, we’ll also need purrr
and tidyr
.
operator.tools is not required, but it’s recommended.
- -For some unknown reason, base R has a %in%
function but not a not-in
function. Unfortunately, the United States is still an empire with its associated areas, islands, and pseudo-states. I only want to include the 50 states, so I needed a way to easily filter out the non-states. operator.tools’ %!in%
function is perfect for that.
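As a rough illustration of what a not-in operator does, here is a minimal base R sketch. The `%notin%` helper and the sample names are made up for this example; it stands in for the operator.tools function rather than reproducing it.

```r
# Hypothetical helper for illustration: the negation of base R's %in%
`%notin%` <- function(x, table) !(x %in% table)

# Made-up sample of place names, some of which are not states
places <- c("Nevada", "Guam", "Oregon", "Puerto Rico")
non_states <- c("Guam", "Puerto Rico", "American Samoa")

# Keep only the entries that are NOT in the non-state list
states_only <- places[places %notin% non_states]
print(states_only)
```

The same pattern works inside a dplyr filter() call, where the operator is applied to a column instead of a plain vector.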
To start, create and save a new file called usa.r
. In it, we’re going to download and modify the United States shape data that we’ll use to create the base map in part two of this series.
At the beginning of each file, you have to load the necessary packages. In this file, the only packages we need to load are tidyverse, sf, and tigris. I also load leaflet to make sure the map renders correctly.
- -1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse")
- library("sf")
- library("tigris")
- library("leaflet")
-
There are two ways to download the USA shape data. First, we can use the R package tigris. Second, we can download it from the Census website.
- -I prefer using tigris but I’ve been having some problems with it. Sometimes it ignores the Great Lakes and merges Michigan and Wisconsin into a Frankenstate (boxed in red below).
- - - -tigris()
downloads the TIGER/Line shapefile data directly from the Census and includes a treasure trove of data, such as land area, water area, state names, and geometry.
tigris can also download boundaries for counties, divisions, regions, tracts, blocks, congressional and school districts, and a whole host of other groupings. A complete list of available data can be found on the package’s GitHub.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-
## download state data using tigris()
- us_states <- tigris::states(cb = FALSE, year = 2020) %>%
- filter(STATEFP < 57) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted shapefile
- st_write(us_states, "path/to/file/usa.shp")
-
Here we create the us_states
variable, save the geographic data to it, move Alaska and Hawaii so they’re beneath the continental US, and save the shifted shapefile.
R uses the <-
operator to define new variables. Here, we’re naming our new variable us_states
.
In our us_states
variable we’re going to store data on the 50 states downloaded using tigris
. Within (::
) tigris, we’re going to use the states()
function.
The states()
function allows you to pull state-level data from the Census. This function takes several arguments.
The cb
argument can either be TRUE
or FALSE
. Setting cb = FALSE
tells tigris to download the most detailed shapefile. If cb = TRUE
, it will download a generalized (1:500k) file. After a lot of trial and error, I found that using cb = TRUE
prevents the Frankenstate from happening.
If the year
argument is omitted, it will download the shapefile for the default year (currently 2020). I set it out of habit from working with county boundaries, where I have to set the year because county boundaries change more often than state boundaries.
Finally, the %>%
operator is part of the Tidyverse. It basically tells R “Hey! I’m not done, keep going to the next line!”
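To see what the pipe does, here is a small sketch that uses base R's native pipe `|>` (available in R 4.1 and later) as a stand-in for `%>%`. For a simple chain like this one the two behave the same way. The numbers are made up.

```r
values <- c(4, 9, 16)

# Nested call: read from the inside out
nested <- sum(sqrt(values))

# Piped call: read top to bottom, each step feeds the next
piped <- values |>
  sqrt() |>
  sum()

print(c(nested, piped))
```

Both versions give the same result; the piped one is just easier to read once the chain gets long.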
tigris::states()
downloads data for the 50 states and the United States’ minor outlying islands, Puerto Rico, and its associated territories. Each state and territory is assigned a unique two-digit Federal Information Processing Standard [FIPS] code.
They’re mostly consecutive (Alabama is 01), but when the codes were conceived in the 1970s a few were reserved for the US territories (American Samoa was 03). In the updated version the “reserved codes” were left out and the territories were assigned new numbers (American Samoa is now 60). The important bit is that the last official state (Wyoming) has a FIPS code of 56.
- -This line of code uses the filter()
function on the STATEFP
variable downloaded using tigris. All it says is keep any row that has a FIPS code of less than 57. This keeps the 50 states (plus the District of Columbia, whose FIPS code is 11) and excludes the territories associated with the United States’ empire.
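Here is a minimal sketch of the FIPS filter on a made-up five-row table, using base R so it runs without tigris. Because STATEFP is stored as text, this version converts it to an integer first so the comparison is explicitly numeric; that conversion is my addition, not part of the code above.

```r
# Made-up sample of the STATEFP and NAME columns; STATEFP is stored as text
fips <- data.frame(
  STATEFP = c("01", "11", "56", "60", "72"),
  NAME = c("Alabama", "District of Columbia", "Wyoming",
           "American Samoa", "Puerto Rico"),
  stringsAsFactors = FALSE
)

# Convert the codes to integers so the comparison is explicitly numeric
kept <- fips[as.integer(fips$STATEFP) < 57, ]
print(kept$NAME)
```

The District of Columbia survives the filter because its code is below 57, so drop it separately if you want exactly 50 rows.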
The shift_geometry()
function is from the tigris package. We use two of its arguments: preserve_area
and position
.
When preserve_area = FALSE
tigris will shrink Alaska’s size and increase Hawaii’s so that they are comparable to the size of the other states.
The position
argument can either be "below"
or "outside"
. When it’s below
, both Alaska and Hawaii are moved to be below California. When it’s outside
then Alaska is moved to be near Washington and Hawaii is moved to be near California.
Since I’m a born-theorist, I should warn you that messing with maps has inherent normative implications. The most common projection is Mercator which stretches the continents near the poles and squishes the ones near the equator.
- - - -One of the competing projections is Gall-Peters, which claims to be more accurate because it was - at the time it was created in the 1980s - the only “area-correct map,” though it has since been criticized for skewing both the polar continents and the equatorial ones. The above photo shows you just how different the projections are from one another.
- -The problem arises because we’re trying to project a 3D object into 2D space. It’s a classic case of even though we can, maybe we shouldn’t. Computers can do these computations and change the projections to anything we want fairly easily. However, humans think and exist in metaphors. We assume bigger = better and up = good. When we use projections that put the Northern Hemisphere both on top and larger than other parts of the world, we are imbuing that projection with metaphorical meaning.
- -I caution you to be careful when creating maps. Think through the implications of something as simple as making Alaska more visually appealing by distorting it to be of similar size as the other states.
- -If you want to read more about map projections this is a good post. If you want to read more about metaphors, I suggest Metaphors We Live By by George Lakoff and Mark Johnson.
- -The sf
package includes a function called st_transform()
which will reproject the data for us. There are a lot of projections. You can read about them on the PROJ website.
Leaflet requires all boundaries to use the World Geodetic System 1984 (WGS84) coordinate system. While making maps I’ve come across two main coordinate systems: WGS84 and North American Datum 1983 (NAD83). WGS84 uses the WGS84 ellipsoid and NAD83 uses the Geodetic Reference System 1980 (GRS80) ellipsoid. From what I’ve gathered, the differences are slight, but Leaflet requires WGS84 and the Census uses NAD83. As a result, we have to reproject the data in order to make our map.
- -The st_transform
call here takes a single PROJ string made up of four parameters, each preceded by a +
. Together, these parameters describe the WGS84 system that the NAD83 data is transformed into.
Briefly, +proj=longlat
tells R to project the data into longitude and latitude [rather than, for example, transverse mercator (tmerc
)].
+ellps=WGS84
sets the ellipsoid to the WGS84 standard.
+datum=WGS84
is a holdover from previous PROJ releases. It tells R to use the WGS84 datum.
+no_defs
is also a holdover.
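A minimal sketch of the same kind of reprojection on a single made-up point, assuming the sf package is installed. It uses EPSG codes (4269 for NAD83, 4326 for WGS84) in place of the PROJ string. That swap is a shorthand I am assuming is acceptable here, not what the code above does.

```r
library(sf)

# A made-up point in NAD83 (EPSG:4269), the datum the Census uses
pt_nad83 <- st_sfc(st_point(c(-114.5, 36.4)), crs = 4269)

# Reproject to WGS84 (EPSG:4326), which Leaflet expects
pt_wgs84 <- st_transform(pt_nad83, 4326)

# Confirm the coordinate reference system changed
print(st_crs(pt_wgs84)$epsg)
```

Checking st_crs() before and after the transform is a quick way to confirm a shapefile is ready for Leaflet.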
Essentially, you need to include line 6 before you create the map, but after you do any data manipulation. It might throw some warnings which you can just ignore.
- -In the last line, we save the data we manipulated in lines 2-6. Strictly speaking you don’t have to save the shapefile. You can manipulate the data and then skip right to mapping the data. I caution against it because the files can get unreadable once you start using multiple data sets. I usually comment out line 9 after I save the file. That way I’m not saving and re-saving it whenever I need to run the code above it.
- -The st_write()
function is part of the sf
package and it takes two arguments. The first is the data set you want to save. Since I used us_states
to save the data, it will be the first argument in the st_write()
function call.
The second argument is the path to where you want the file saved and what name you want to give it. I named mine usa
. It is mandatory that you add .shp
to the end of the filepath so that R knows to save it as a shapefile.
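Instead of commenting the save line in and out, one option is to guard it with a file check. This is only a sketch, assuming sf is installed; the temporary path and the one-row demo data set are placeholders standing in for your own folder and us_states.

```r
library(sf)

# Placeholder output path; point this at your own shapefiles folder
out_path <- file.path(tempdir(), "usa.shp")

# Made-up one-row data set standing in for us_states
demo <- st_sf(name = "demo",
              geometry = st_sfc(st_point(c(-114.5, 36.4)), crs = 4326))

# Only write when the file is not there yet, so reruns skip the save
if (!file.exists(out_path)) {
  st_write(demo, out_path, quiet = TRUE)
}

# List the component files that were written alongside usa.shp
print(list.files(tempdir(), pattern = "^usa\\."))
```

The trade-off is that the guard will also skip the save after you change the data, so delete the old files when you want a fresh copy.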
Although it’s called a shapefile, it’s actually four files. I usually create a separate folder for each set of shapefiles and store that in one master folder called shapefiles. An example of my folder structure is below. I keep all of this in my GitHub repo and track changes using DVC.
- - - -On my C://
drive is My Documents
. In that folder I keep a GitHub
folder that holds all my repos, including my nps
one. Inside the nps
folder I separate my shapefiles into their own folder. For this tutorial I am using original and shifted shapefiles, so I’ve also separated them into two separate folders to keep things neat. I also know I’m going to have multiple shapefiles (one for the USA, one for the National Parks, and a final one for the State Parks) so I created a folder for each set. In the usa
folder I saved the shifted states shapefile.
Altogether, my line 9 would read:
- - - -Running that line will save the four necessary files that R needs to load the geographic data.
- -That’s it for method 1 using tigris
. The next section, method 2, shows how to load and transform a previously downloaded shapefile. If you used method 1, feel free to leave this post and go directly to mapping the shapefile in part II of this series.
In this section, I’ll go through the process of downloading the shapefiles from the Census website. If you tried method 1 and tigris caused the weird Frankenstate, you can try using the data downloaded from the Census website. I don’t know why it works, since tigris uses the same data, but it does.
- -Generally, though, finding and using shapefiles created by others is a great way to create cool maps. There are thousands of shapefiles available, many from ArcGis’ Open Data Website.
- -Save the file wherever you want, but I prefer to keep it within the “original” shapefiles folder in a sub-folder called “zips.” Once it downloads, unzip it - again, anywhere is fine. It will download all 30 Census shapefiles. We’re only going to use the one called “cb_2021_us_state_500k.zip”. The rest you can delete, if you want.
- - - -When you unzip the cb_2021_us_state_500k.zip, it will contain four files. You’ll only ever work with the .shp
file, but the other three are used in the background to display the data.
Once all the files are unzipped, we can load the .shp
file into R.
1
-2
-3
-4
-5
-6
-7
-8
-9
-
## load a previously downloaded shapefile
- usa <- read_sf("shapefiles/original/usa/states/cb_2021_us_state_500k.shp") %>%
- filter(STATEFP < 57) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted shapefile
- st_write(usa, "path/to/file/usa.shp")
-
Everything except line 2 is the same as in method 1. I won’t go over lines 3-9 here, because all the information is above.
- -This line is very similar to the one above. I changed the name of the variable to usa
so I could keep both methods in the same R file (each R variable needs to be unique or it will be overwritten).
read_sf
is part of the sf package. It’s used to load shapefiles into R. The path to the file is enclosed in quotation marks and parentheses. Simply navigate to wherever you unzipped the cb_2021_us_state_500k file and choose the file with the .shp
extension.
Once the shapefiles are downloaded - either using tigris() or by loading the shapefiles from the Census website - you can create the base map. I’ll tackle making the base map in part II of this series.*
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/ac/f1d7a04d93d63e1cfd38f18f51f377993675ece83f0e5253bd794637fb9188 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/ac/f1d7a04d93d63e1cfd38f18f51f377993675ece83f0e5253bd794637fb9188 deleted file mode 100644 index 886abda..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/ac/f1d7a04d93d63e1cfd38f18f51f377993675ece83f0e5253bd794637fb9188 +++ /dev/null @@ -1,701 +0,0 @@ -I"śWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to load only the geopackage layer names. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
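To show the idea without the full PAD-US download, here is a sketch that writes a tiny made-up geopackage, lists its layers, and then reads only the first couple of rows with an SQL query. It assumes sf is installed; the file, the layer name demo_layer, and the points are all invented for the example.

```r
library(sf)

# Write a tiny made-up geopackage so the example runs without PAD-US
gpkg <- tempfile(fileext = ".gpkg")
pts <- st_sf(id = 1:3,
             geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)),
                               st_point(c(2, 2)), crs = 4326))
st_write(pts, gpkg, layer = "demo_layer", quiet = TRUE)

# List the layers without loading any features
print(st_layers(gpkg)$name)

# Read only the first two rows of one layer
first_rows <- st_read(gpkg, query = 'SELECT * FROM "demo_layer" LIMIT 2',
                      quiet = TRUE)
print(nrow(first_rows))
```

For PAD-US, you would swap in the path to PADUS3_0Geopackage.gpkg and a layer name such as PADUS3_0Fee. Reading a handful of rows this way may be enough to see the column names without R crashing.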
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, and 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
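Here’s a small sketch of that same logic on a toy table, so you can see which rows survive without loading PAD-US. The park names and the FED value are invented for the example, and I’m using base R’s !(x %in% y) as a stand-in for operator.tools’ %!in% so the snippet only needs dplyr.

```r
library(dplyr)

# Toy stand-in for the PAD-US attribute table (values are made up)
parks <- tibble(
  Unit_Nm  = c("Park A", "Park B", "Park C", "Park D"),
  State_Nm = c("NV", "PR", "CA", "OR"),
  Own_Type = c("STAT", "STAT", "FED", "STAT")
)
territories <- c("AS", "GU", "MP", "PR", "VI")

# Keep a row only when BOTH conditions are TRUE:
# the state is not a territory AND the land is state owned
kept <- parks %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT")

kept$Unit_Nm
```

Park B drops out because it is in a territory, and Park C drops out because it is not state owned.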
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
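One possible way to fold everything into a single filter() call is to swap the chain of | comparisons for %in%. This is only a sketch on toy data (the MIL code is made up), not the code I actually ran. My guess at why mixing & and | in one call misbehaves is that R evaluates & before |, so the conditions need parentheses to group the way you expect.

```r
library(dplyr)

# Toy table: every row is state owned, designations differ (values are made up)
parks <- tibble(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  Own_Type = c("STAT", "STAT", "STAT"),
  Des_Tp   = c("SP", "MIL", "SREC")
)

# The designation codes kept in the post's filter
keep_designations <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Conditions separated by commas inside filter() are combined with AND
kept <- parks %>%
  filter(Own_Type == "STAT", Des_Tp %in% keep_designations)

kept$Unit_Nm
```

Keeping the codes in one vector also means there is a single place to edit when I change my mind about which designations to include.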
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area
s, the rest I just populated the new column with the old values.
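As a quick refresher, here’s the general form on a toy column. The "Something Else" value and the TRUE fallback line are my additions for illustration. Without a fallback, case_when() returns NA for any row that matches none of the conditions.

```r
library(dplyr)

# Toy designations (the last value is made up to show the fallback)
designations <- tibble(
  d_Des_Tp = c("State Park", "State Wilderness", "Access Area", "Something Else")
)

relabeled <- designations %>%
  mutate(type = case_when(
    d_Des_Tp == "State Park"       ~ "State Park or Parkway",
    d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
    d_Des_Tp == "Access Area"      ~ "State Trail",
    TRUE                           ~ d_Des_Tp  # fallback: keep the original value
  ))

relabeled$type
```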
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the mean time I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
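One option I may try for tracking visits is to keep the park names in a single vector and test membership with %in%, instead of adding a case_when() line per park. This is a sketch on toy data, not what the code above does, and "Hypothetical State Park" is invented for the example.

```r
library(dplyr)

# Parks I have visited, kept in one place
visited_parks <- c("Valley of Fire State Park", "Anza-Borrego Desert State Park")

# Toy table of park names (the middle one is made up)
parks <- tibble(
  Unit_Nm = c("Valley of Fire State Park",
              "Hypothetical State Park",
              "Anza-Borrego Desert State Park")
)

# Flag each park by checking whether its name is in the visited vector
parks <- parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

parks$visited
```

Adding a park then means appending one string to visited_parks, and the vector could just as easily be read from a text file.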
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
- -1
-2
-3
-4
-5
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -The split()
is part of base R. It takes quite a few arguments, most of which are optional.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks
data into its corresponding states, so it is listed first.
The second argument f =
is how you want the data split. f
in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm))
in the terminal, it will return a list of the 50 state abbreviations. That is what we’re telling R to do here.
You can access an individual state using the $
operator. split_states$CA
will return the state park data for California.
names
is also part of base R. It does what it sounds like - it gets the names of an object. Here, I want to get the name of each split data set.
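Here’s a tiny base R sketch of split() and names() on a made-up table, so you can see the shape of what comes back before running it on the real park data.

```r
# Toy table (made-up parks) with a state column to split on
parks <- data.frame(
  Unit_Nm  = c("Park A", "Park B", "Park C", "Park D"),
  State_Nm = c("CA", "NV", "CA", "OR")
)

# One data frame per state, stored in a named list
split_parks <- split(parks, f = parks$State_Nm)

names(split_parks)    # the state abbreviations, sorted alphabetically
nrow(split_parks$CA)  # number of toy parks in California
```

The result is a list, which is why both split_parks$CA and split_parks[["CA"]] pull out the California rows.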
Here’s the actual for loop.
- -The basic logic of a for loop is:
- for(x in y){
- do something}
Inside the parentheses is the sequence to loop over. The content in the curly braces runs once for each element of y, with x set to that element.
- -In line 4, for(name in all_names){
says as long as there’s a name in the list of all names, do whatever is inside the curly braces. name
can be whatever you want. It’s a placeholder value. I can have it say for(dogs in all_names){
it will still do the exact same thing. A lot of the time you’ll see it as an i
for item. I like to use more descriptive language because, again, for loops are my Achilles’ heel.
The all_names
part is wherever you want R to look for the data. It will change based on your data set and variable naming conventions.
In line 5, I save the split data sets.
- -st_write()
is part of the sf package which allows us to create shapefiles. This can be any saving function (e.g. write_csv() if you want to save CSVs). The function takes several arguments. In line 43 above I showed the basic structure: st_write(data, path/to/file.shp). This is good if you only have one file, but since I’m saving them in a loop I don’t want all of the files to have the same name. R will error out after the first and tell you the file already exists.
The first part split_states[[name]]
is still telling R what data to save, but using an index instead of a specific data frame name. To access an index you use data[[some-value]]
where some-value
is the index location or the element’s name. In my code, R will take the split_states
data and go: alright, on the first pass through the loop, name
is "AK", so return the data frame stored under that name. It will then do that for every name as it loops through the split_states
data.
paste0()
is also part of base R - it’s apparently faster than paste()
. It concatenates (or links together) different pieces into one. I’m using it to create the filename. Within the paste0
call anything within quotation marks is static. So every file will be saved to "shapefiles/shifted/states/individual/"
and every file will have the extension .shp
. What will change with each loop is the name
of the file. One by one, R will loop through and save each file using the name
it pulled from all_names
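To watch the loop and paste0() work together without needing any shapefiles, here’s a sketch that saves toy data as CSVs into a temporary folder. The table, the out_dir folder, and the use of write.csv() in place of st_write() are all stand-ins for illustration.

```r
# Toy table (made-up parks) split by state
parks <- data.frame(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  State_Nm = c("CA", "NV", "CA")
)
split_parks <- split(parks, f = parks$State_Nm)

# Temporary output folder standing in for shapefiles/shifted/states/individual/
out_dir <- file.path(tempdir(), "individual")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

for (name in names(split_parks)) {
  # The static pieces stay the same; only `name` changes on each pass
  out_path <- paste0(out_dir, "/", name, ".csv")
  write.csv(split_parks[[name]], out_path, row.names = FALSE)
}

list.files(out_dir)
```

If you re-run the real loop, st_write() will complain that the files already exist. As far as I can tell from the sf documentation, passing append = FALSE or delete_dsn = TRUE should let it overwrite them, but double-check that against your version of sf.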
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me an email or a tweet if it is.
-** print("Hello World!")
<blockquote class = "blockquote">It made a difference to that one.
+ <footer class="blockquote-footer">The Star Thrower | <cite title="Source Title"><a href = "https://mrjakeparker.tumblr.com/post/87041680432/star-thrower-is-based-off-of-this-story-which-was">Mr. Jake Parker</a></cite> (my favorite version)</footer>
+ </blockquote>
+ <p>I have a simple motto in life: Do what you can, where you are, with what you have. As a result, I believe strongly in doing whatever is in my means to make the world a better place. Below are some groups that I have either founded or joined in order to help those around me.</p>
+ <h1>Current Groups</h1>
+ <hr class = "h-line">
+ <ul>
+ <li><i>Political Science Methodology Group</i> | Co-organizer with Melina Much <br/>
+ University of California, Irvine </li><br/>
+ <li><i>Political Science Womxn's Caucus</i> | Student leader <br/>
+ University of California, Irvine </li><br/>
+ <li><i>Political Science Workshop Coordinator</i> <br/>
+ University of California, Irvine </li><br/>
+ <li><i>Legal Politics Writing Workshop</i><br/>
+ University of California, Irvine </li><br/>
+ <li><i>Center for Democracy: Writing Workshop</i> | Member <br/>
+ University of California, Irvine</li><br/>
+ <li><i>UCI Humanities: Writing Workshop</i> | Member <br/>
+ University of California, Irvine</li><br/>
+ </ul>
+ <h1>Previous Groups</h1>
+ <hr class = "h-line">
+ <ul>
+ <li><i>Friends of the San Dimas Dog Park</i> | Ambassador <br/>
+ San Dimas, California </li><br/>
+ <li><i>Prisoner Education Project</i> | Volunteer <br/>
+ Pomona, California</li><br/>
+ <li><i>Tails of the City</i> | Volunteer Photographer <br/>
+ Los Angeles, California</li><br/>
+ <li><i>Philosophy Club</i> | President, Graphic Designer, and Banquet Chair <br/>
+ California State Polytechnic University, Pomona</li><br/>
+ <li><i><a href = "https://www.voteamerica.com/">Long Distance Voter</a></i> | Intern <br/>
+ Social Media Content Creator</li><br/>
+ <li><i><a href = "https://www.freepress.net/">Free Press</a></i> | Intern <br/>
+ Social Media Content Creator</li>
+ </ul>
+<!-- </div>
+
</div> + –>
+:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/ae/aac6de25a3018957bfd207974447f44e3a40a693e6ae6ee571cc91f9458d20 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/ae/aac6de25a3018957bfd207974447f44e3a40a693e6ae6ee571cc91f9458d20 deleted file mode 100644 index 118a437..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/ae/aac6de25a3018957bfd207974447f44e3a40a693e6ae6ee571cc91f9458d20 +++ /dev/null @@ -1,290 +0,0 @@ -I"ˇKWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data that do not rely on loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/b1/c273246ffff1cc2367ff0905edc798542314fcf0c6eb6c1d05695c4328ee51 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/b1/c273246ffff1cc2367ff0905edc798542314fcf0c6eb6c1d05695c4328ee51 new file mode 100644 index 0000000..8426c9d --- /dev/null +++ b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/b1/c273246ffff1cc2367ff0905edc798542314fcf0c6eb6c1d05695c4328ee51 @@ -0,0 +1,22 @@ +I"ń Here you'll find several non-partisan ballot guides. I try to include as much information as possible without directly recreating the official voter guide. The information is sourced from the ballot guide, calmatters, ballotpedia, LA Times, voter’s edge, and Mercury News. I have included links to the campaign website or wherever most of the information came from.
+
+ I have done my best to keep my views out of it.
+
+ I started this to help my two aunts because they would ask me to simplify their ballots for them. Democracy relies on an informed and participatory citizenry, but it’s not always easy. This is meant to alleviate some of the burden.
+
+ If you notice any errors, you feel like I’ve missed something, or you found this guide helpful feel free to send me an email [click the envelope at the bottom of the page] <br><br>
+ <h3>helpful resources:</h3>
+ <ol><li>Register to vote (Deadline May 23): <a href="https://registertovote.ca.gov/">https://registertovote.ca.gov/</a> </li>
+ <li>Check your registration status: <a href="https://voterstatus.sos.ca.gov/">https://voterstatus.sos.ca.gov/</a></li>
+ <li>Access the official voter guide: <a href="https://voterguide.sos.ca.gov/">https://voterguide.sos.ca.gov/</a> </li>
+ <li>Early voting & ballot drop off locations: <a href="https://caearlyvoting.sos.ca.gov/">https://caearlyvoting.sos.ca.gov/</a></li>
+ <li>Track your ballot: <a href="https://california.ballottrax.net/voter/">https://california.ballottrax.net/voter/</a></li>
+ <li>If you are in the Los Angeles, San Bernardino, Orange County area and need help getting to your polling place I will either find you resources or help you get there. I also offer to go with you to vote (and I will bring my two large German Shepherds) if you feel unsafe going to vote alone.</li></ol>
+
<a href="https://github.com/liz-muehlmann/Election_Guides/raw/main/California/Primary%20Elections/National%20and%20State/2022%20-%20Primary%20-%20California.pdf" download="2022_Primary_CA.pdf">2022 Primary California</a> s
+
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to list the geopackage layers. This doesn’t load the complete data set; it just reads the layer names, which I use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can look at them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
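To make that output concrete, here is a minimal sketch that writes a tiny, made-up geopackage to a temporary file and lists its layers. The layer name toy_points and its two points are assumptions for illustration only; they are not part of PAD-US.

```r
library(sf)

# Two made-up points standing in for real features (hypothetical data)
pts <- st_sf(
  name = c("a", "b"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326)
)

# Write them to a temporary geopackage as a single layer
gpkg <- tempfile(fileext = ".gpkg")
st_write(pts, gpkg, layer = "toy_points", quiet = TRUE)

# st_layers() lists the layers in the file without returning the features
layers <- st_layers(gpkg)
layers$name  # the layer names stored in the file
print(layers)
```

With the real PAD-US file, the same call should list all of its layers, and layers$name gives you the exact strings to pass to the layer argument later on.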
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
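If you do want to peek at a layer from R, one option is the query argument of st_read(), which passes an SQL statement to the file so only the matching rows come back. The sketch below uses a made-up geopackage with a hypothetical layer called toy_fee; I am assuming, not promising, that the same approach keeps memory use manageable on the full PAD-US file.

```r
library(sf)

# A made-up stand-in for a PAD-US layer (hypothetical rows and values)
toy <- st_sf(
  Unit_Nm  = c("Park A", "Park B", "Base C"),
  Own_Type = c("STAT", "STAT", "FED"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(2, 2)),
                    crs = 4326)
)
gpkg <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg, layer = "toy_fee", quiet = TRUE)

# Read only the rows that match the SQL condition instead of the whole layer
state_only <- st_read(
  gpkg,
  query = "SELECT * FROM toy_fee WHERE Own_Type = 'STAT'",
  quiet = TRUE
)
nrow(state_only)
```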
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land). My interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image, the entries on the left are the variable names - I’ll use these later to select certain variables. On the right are the values for the specific park. I’ve boxed a few of the properties, which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/b5/5111852e32e243664db5e600e86a8b2fed27fdc7aa672860341969c1e934bd b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/b5/5111852e32e243664db5e600e86a8b2fed27fdc7aa672860341969c1e934bd deleted file mode 100644 index f17ed4c..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/b5/5111852e32e243664db5e600e86a8b2fed27fdc7aa672860341969c1e934bd +++ /dev/null @@ -1,622 +0,0 @@ -I"]éWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered them in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to list the geopackage layers. This doesn’t load the complete data set; it just reads the layer names, which I use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land). My interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access" &
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
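- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
rather than read_sf()
. Either function can read a geopackage, since read_sf() is a wrapper around st_read() with quieter defaults. Here, st_read()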
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the layer PADUS3_0Fee layer because I know it contains the shape data from the state parks.
The Tidyverse’s filter()
function lets you combine conditions within one call by using “or” |
or “and” &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
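Here is a small sketch of that logic on made-up rows. The park names and the "FED" owner code are placeholders I invented, and I use !(x %in% y) from base R, which behaves the same as the %!in% operator from operator.tools.

```r
library(dplyr)

territories <- c("AS", "GU", "MP", "PR", "VI")

# Hypothetical rows standing in for the PAD-US Fee layer
toy <- tibble(
  Unit_Nm  = c("Park A", "Park B", "Park C", "Base D"),
  State_Nm = c("NV", "PR", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "STAT", "FED")
)

# Keep a row only when BOTH conditions are true
kept <- toy %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT")

kept$Unit_Nm
```

Park B drops out because it is in a territory, and Base D drops out because it is not state owned.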
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
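One possible reason a single combined call misbehaves (this is my guess, not something I have confirmed against the original attempt) is operator precedence: in R, & binds more tightly than |, so mixing them without parentheses changes the meaning. A sketch with made-up rows, using %in% as a compact alternative to the chain of | comparisons:

```r
library(dplyr)

keep_types <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Hypothetical rows; "FED" and "OTHER" are placeholder codes
toy <- tibble(
  Unit_Nm  = c("Park A", "Federal Area B", "State Office C"),
  Own_Type = c("STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "REC", "OTHER")
)

# Pitfall: this is read as (STAT & SP) | REC, so the federal row sneaks in
mixed <- toy %>%
  filter(Own_Type == "STAT" & Des_Tp == "SP" | Des_Tp == "REC")

# One call, two conditions joined by "and"; %in% handles the list of types
combined <- toy %>%
  filter(Own_Type == "STAT", Des_Tp %in% keep_types)

mixed$Unit_Nm
combined$Unit_Nm
```

Separate filter() calls, as in the code above, avoid the problem entirely, which is a perfectly reasonable choice.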
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
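A quick way to answer the “what are the other types?” question without leaving R is to count the values before filtering. The rows below are made up; the one thing worth noticing is that a missing value is also dropped, because NA != "Closed" evaluates to NA and filter() only keeps rows where the condition is TRUE.

```r
library(dplyr)

# Hypothetical rows covering the four access types plus a missing value
toy <- tibble(
  Unit_Nm    = c("Park A", "Park B", "Park C", "Park D", "Park E"),
  d_Pub_Acce = c("Open Access", "Closed", "Restricted Access", "Unknown", NA)
)

# See every access type and how often it occurs
toy %>% count(d_Pub_Acce)

# Drop closed and unknown land; the NA row disappears as well
open_parks <- toy %>%
  filter(d_Pub_Acce != "Closed" &
           d_Pub_Acce != "Unknown")

open_parks$Unit_Nm
```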
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
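As a minimal sketch of the difference, the two calls below return the same columns from a made-up table, but the first one still makes sense if the column order ever changes. Src_Date is a placeholder column I added for the example. On an sf object, the geometry column comes along automatically even when it is not named.

```r
library(dplyr)

# Hypothetical table; only the first three columns are wanted
toy <- tibble(
  Unit_Nm   = c("Park A", "Park B"),
  State_Nm  = c("NV", "OR"),
  GIS_Acres = c(100, 250),
  Src_Date  = c("2020", "2021")
)

by_name  <- toy %>% select(Unit_Nm, State_Nm, GIS_Acres)  # readable and order-proof
by_index <- toy %>% select(1, 2, 3)                        # breaks if columns move

identical(by_name, by_index)
```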
I decided to keep:
-mutate()
is part of dplyr, one of the core tidyverse packages, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for Recreation Management Area
s; for the rest, I just populated the new column with the old values.
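The general form can be sketched on three made-up rows. The TRUE ~ d_Des_Tp line is the fallback: any value that does not match an earlier condition is copied over unchanged.

```r
library(dplyr)

# Hypothetical designation values
toy <- tibble(
  d_Des_Tp = c("State Park", "Recreation Management Area", "State Wilderness")
)

recoded <- toy %>%
  mutate(type = case_when(
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    TRUE ~ d_Des_Tp  # fallback: keep the original value
  ))

recoded$type
```

Without a fallback line, unmatched rows would get NA in the new column.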
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
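One option for keeping track, offered as a sketch rather than the method used in this series, is to hold the visited park names in a single vector and test membership with %in%. Adding a park then means adding one string instead of one case_when() line. “Some Other State Park” is a made-up name.

```r
library(dplyr)

# The visited list lives in one place and is easy to extend
visited_parks <- c(
  "Valley of Fire State Park",
  "Crissey Field State Recreation Site",
  "Salton Sea"
)

# Hypothetical rows standing in for the state park data
toy <- tibble(
  Unit_Nm = c("Valley of Fire State Park", "Some Other State Park", "Salton Sea")
)

marked <- toy %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

marked$visited
```

The vector could just as well be read from a small text or CSV file, so the list of visited parks stays out of the processing code.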
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data for Alaska and Hawaii so that both states appear below the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
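The coordinate change can be checked on a single point. In this sketch I use EPSG:5070 (a common Albers projection for the continental US) as a stand-in for the Albers system and EPSG:4326 for WGS84; both codes and the rough coordinates are my assumptions, not values taken from the PAD-US file.

```r
library(sf)

# A made-up point given in longitude/latitude
park_lonlat <- st_sfc(st_point(c(-114.5, 36.4)), crs = 4326)

# Project it to Albers, then transform it back to WGS84 for Leaflet
park_albers <- st_transform(park_lonlat, 5070)
park_wgs84  <- st_transform(park_albers, 4326)

st_is_longlat(park_albers)  # FALSE: projected coordinates in meters
st_is_longlat(park_wgs84)   # TRUE: longitude/latitude again
```

st_is_longlat() is a handy check to run before handing any layer to Leaflet.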
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “hello world!”**
- -* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me a line if it is.
-** print("Hello World!")
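For a rough idea of what such a loop could look like, here is a sketch on a plain data frame. It uses a for loop, which repeats once per state rather than “as long as condition X is true”, and it writes CSV files to a temporary folder. The folder name and the CSV format are placeholders I chose for the example; the real version would save shapefiles.

```r
# Hypothetical park table without geometry
parks <- data.frame(
  Unit_Nm  = c("Valley of Fire State Park", "Crissey Field State Recreation Site",
               "Anza-Borrego Desert State Park", "Salton Sea"),
  State_Nm = c("NV", "OR", "CA", "CA"),
  stringsAsFactors = FALSE
)

# Temporary output folder (placeholder location)
out_dir <- file.path(tempdir(), "state_parks_by_state")
dir.create(out_dir, showWarnings = FALSE)

# One data frame per state, named by its abbreviation
by_state <- split(parks, parks$State_Nm)

# Save each state's parks to its own file
for (state in names(by_state)) {
  out_file <- file.path(out_dir, paste0(state, ".csv"))
  write.csv(by_state[[state]], out_file, row.names = FALSE)
}

list.files(out_dir)
```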
Welcome to part five of my cartography in R series. In this post I’ll return to the maps created in part II and part III to include a Shiny information box and popups linking to posts about my adventures in the National Parks.
+ + + + + + | + + + +read more +Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/c2/96502ccbab1bca4bf4cf08a9653f7d70e029cf9cbc9e33fb1366f539bfe950 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/c2/96502ccbab1bca4bf4cf08a9653f7d70e029cf9cbc9e33fb1366f539bfe950 deleted file mode 100644 index 22dfcbd..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/c2/96502ccbab1bca4bf4cf08a9653f7d70e029cf9cbc9e33fb1366f539bfe950 +++ /dev/null @@ -1,645 +0,0 @@ -I"‘ńWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered them in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to list the geopackage layers. This doesn’t load the complete data set; it just reads the layer names, which I use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land). My interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. Looking at the table, I can see that the Own_Type matches Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer and filter for the 50 states and for state-owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE, # shift_geometry() comes from the tigris package
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read() instead of read_sf(). Either function can open a geopackage; read_sf() is a wrapper around st_read() with quieter defaults. st_read() takes two arguments here. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35, make sure you choose the one with the .gpkg extension.
The second argument, layer =, specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
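As an aside, sf can also filter rows while it reads, which may help with a file this large. This is a minimal sketch, not the code I ran for this post: it builds a tiny throwaway geopackage (the layer name toy_fee and all of its values are made up) and reads it back twice, once with layer = and once with the query argument of st_read().

```r
library(sf)

# Toy stand-in for the PAD-US geopackage. The layer name and values are
# invented for illustration; only the column names mirror the post.
toy <- st_sf(
  Unit_Nm  = c("Park A", "Park B", "Base C"),
  Own_Type = c("STAT", "STAT", "FED"),
  geometry = st_sfc(st_point(c(-115, 36)),
                    st_point(c(-124, 42)),
                    st_point(c(-116, 33)),
                    crs = 4326)
)
gpkg <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg, layer = "toy_fee", quiet = TRUE)

# Read the whole layer by name, as in the post
whole_layer <- st_read(gpkg, layer = "toy_fee", quiet = TRUE)

# Or let an SQL query drop rows before they ever reach R
state_only <- st_read(gpkg,
                      query = "SELECT * FROM toy_fee WHERE Own_Type = 'STAT'",
                      quiet = TRUE)
```

With the real file, the table name in the query would be the layer name (PADUS3_0Fee); I have not timed whether this is faster on the full data set.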
R and the Tidyverse’s filter() function allow you to combine conditions within one filter call by using or (|) or and (&).
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
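-The unfiltered data set had 247,507 rows. After the two conditions in this line are applied, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter() call in line 5, but I couldn’t get it to work. A likely cause is operator precedence: R evaluates & before |, so a chain of | conditions needs its own parentheses when it shares a call with &.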
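Here is a small sketch of how the two filters might be folded into one call. It runs on a made-up five-row tibble rather than the PAD-US data, uses base R’s %in% instead of %!in% so it needs only dplyr, and the helper name keep_types is my own.

```r
library(dplyr)

# Made-up rows; the column names and codes follow the post
parks <- tibble(
  State_Nm = c("NV", "OR", "PR", "CA", "CA"),
  Own_Type = c("STAT", "STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SREC", "SP", "SP", "MIL")
)
territories <- c("AS", "GU", "MP", "PR", "VI")
keep_types  <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Commas inside filter() act like &, and %in% replaces the chain of |
one_call <- parks %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% keep_types)

# Without parentheses, & is evaluated before |, so the federal "SP" row slips in
unparenthesised <- parks %>%
  filter(Own_Type == "STAT" & Des_Tp == "SREC" | Des_Tp == "SP")

# With parentheses, the | chain is evaluated as one condition
parenthesised <- parks %>%
  filter(Own_Type == "STAT" & (Des_Tp == "SREC" | Des_Tp == "SP"))
```

The %in% version also keeps the list of designations in one place, which makes it easier to add or drop a type later.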
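Next, I want to choose certain types of state-owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations: Access Area (ACC), Historic or Cultural Area (HCA), Recreation Management Area (REC), State Conservation Area (SCA), State Historic or Cultural Area (SHCA), State Park (SP), State Recreation Area (SREC), State Resource Management Area (SRMA), and State Wilderness (SW).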
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter() call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter() call, I chose to use != solely because months or years from now, when I look at this code, it will be easier for me to figure out what I was doing. I know myself, and if I saw d_Pub_Acce == "Open Access" my first thought would be: “What are the other types?” Then I’d try to find out and waste a bunch of time.
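For comparison, here is a hedged sketch of the two ways to write this filter, run on a made-up tibble (the park names are placeholders). It also shows count(), which answers the “what are the other types?” question directly.

```r
library(dplyr)

# Made-up rows, including one missing value
access <- tibble(
  Unit_Nm    = c("Park A", "Park B", "Park C", "Park D", "Park E"),
  d_Pub_Acce = c("Open Access", "Restricted Access", "Closed", "Unknown", NA)
)

# List every access type and how often it appears
access_types <- access %>% count(d_Pub_Acce)

# Exclusion, as in the post: drop Closed and Unknown
by_exclusion <- access %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")

# Inclusion: keep only the two types I can visit
by_inclusion <- access %>%
  filter(d_Pub_Acce %in% c("Open Access", "Restricted Access"))
```

On this toy data both versions keep the same two parks. Note that filter() drops the row with the missing value either way, because a comparison with NA is never TRUE.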
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select() lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep: d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, and GIS_Acres.
-mutate() comes from dplyr, one of the tidyverse packages, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate() and case_when() in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value"). I only needed to change the values for the Recreation Management Areas; for the rest, I just populated the new column with the old values.
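As a quick refresher, here is a minimal sketch of that pattern on a made-up tibble. The "Local Park" row is invented to show the fallback: a final TRUE ~ d_Des_Tp carries the old value over for anything the earlier conditions miss, where it would otherwise become NA.

```r
library(dplyr)

# Made-up designations; "Local Park" is not one of the types in the post
designations <- tibble(
  d_Des_Tp = c("State Park", "Access Area", "State Wilderness", "Local Park")
)

recoded <- designations %>%
  mutate(type = case_when(
    d_Des_Tp == "Access Area"      ~ "State Trail",
    d_Des_Tp == "State Park"       ~ "State Park or Parkway",
    d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
    TRUE                           ~ d_Des_Tp  # fallback: keep the old value
  ))
```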
Here is where I ran into some issues. In part III of the series, when I processed the National Park data, I included a mutate() and case_when() call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate() creates a new column, visited, and populates it by using case_when().
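One possible way to keep track of visited parks with less typing is to hold the names in a vector and test membership with %in%. This is only a sketch on made-up rows, not what the code above does; visited_parks is a name I invented, and "Some Other State Park" is a placeholder.

```r
library(dplyr)

# Names of the parks I have visited, kept in one place
visited_parks <- c(
  "Valley of Fire State Park",
  "Crissey Field State Recreation Site",
  "Salton Sea"
)

# Made-up rows standing in for the state park data
parks <- tibble(
  Unit_Nm = c("Valley of Fire State Park", "Some Other State Park", "Salton Sea")
)

# One if_else() replaces one case_when() line per park
parks_flagged <- parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))
```

Adding a park then means adding one string to the vector, and the vector could just as easily be read from a small text or CSV file.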
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
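To see what that transformation does, here is a small sketch on a single made-up point. EPSG:5070 is an assumed stand-in for the Albers projection; the shifted data may carry a different Albers definition.

```r
library(sf)

# A made-up point in a US Albers equal-area projection (units are meters).
# EPSG:5070 is an assumption used for illustration only.
pt_albers <- st_sfc(st_point(c(-1500000, 1800000)), crs = 5070)

# Same proj string as line 40 of the post
pt_wgs84 <- st_transform(pt_albers, "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")

# Check that the result is in longitude/latitude degrees
is_lonlat <- st_is_longlat(pt_wgs84)
lon_lat   <- st_coordinates(pt_wgs84)
```

st_is_longlat() is a handy check to run on any layer before handing it to Leaflet.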
- -Line 43 saves the shifted shapefile to the hard drive. Delete the # from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
-
split_states <- split(state_parks, f = state_parks$State_Nm) # split the data by state
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))
- }
-
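If you want to try the loop without touching the real data, here is a sketch on a made-up three-row layer that writes into a temporary folder. Two details are my own additions rather than part of the code above: dir.create() makes sure the output folder exists, and append = FALSE lets the loop be re-run, since as far as I can tell st_write() refuses to overwrite an existing file by default.

```r
library(sf)

# Made-up points standing in for the state park polygons
toy <- st_sf(
  State_Nm = c("CA", "CA", "NV"),
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  geometry = st_sfc(st_point(c(-120, 37)),
                    st_point(c(-116, 33)),
                    st_point(c(-115, 36)),
                    crs = 4326)
)

# Write into a temporary folder so nothing on disk is disturbed
out_dir <- file.path(tempdir(), "individual")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

split_states <- split(toy, f = toy$State_Nm)  # one sf object per state

for (name in names(split_states)) {
  st_write(split_states[[name]],
           file.path(out_dir, paste0(name, ".shp")),
           append = FALSE,  # replace the layer if it already exists
           quiet  = TRUE)
}
```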
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me a line if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state park data is hard to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to list the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers so I can look at them whenever I want. You don’t have to; the st_layers() function will return the list regardless of whether it’s saved.
st_layers() returns the name of each layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
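If you want to see what st_layers() gives back without downloading anything, here is a minimal sketch on a throwaway geopackage. The layer names toy_fee and toy_easement are made up, and layer_summary is just a convenience data frame I build from the result.

```r
library(sf)

# A one-point layer written twice under two invented layer names
pt <- st_sf(
  Unit_Nm  = "Park A",
  geometry = st_sfc(st_point(c(-115, 36)), crs = 4326)
)
gpkg <- tempfile(fileext = ".gpkg")
st_write(pt, gpkg, layer = "toy_fee", quiet = TRUE)
st_write(pt, gpkg, layer = "toy_easement", quiet = TRUE)

# Lists the layers without reading any features into memory
layers <- st_layers(gpkg)

# Pull the pieces out into a plain data frame for easier reading
layer_summary <- data.frame(
  name     = layers$name,      # layer names
  features = layers$features,  # number of rows in each layer
  fields   = layers$fields     # number of attribute columns
)
```

The values in the name column are what you pass to layer = when reading a single layer.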
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read() code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
- filter(Own_Type == "STAT") # Own_Type holds the ownership code; "State" is the label shown in d_Own_Type
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) I load the data using st_read()
instead of read_sf(), although read_sf() can read a geopackage layer as well.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data from the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call.
Common filter operators include &
(and), |
(or), <
(less than), or > (greater than).
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management and the Bureau of Reclamation. Having visited the park, I can tell you there’s no fences blocking these areas off. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area. It will be a good test case to make sure I’m selecting the correct data.
- - -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/c4/6bed8a13a9c81f479eb4ef15428eb80d902b19ffe2ebc564cccfca34712f6f b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/c4/6bed8a13a9c81f479eb4ef15428eb80d902b19ffe2ebc564cccfca34712f6f deleted file mode 100644 index 6ca5aa5..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/c4/6bed8a13a9c81f479eb4ef15428eb80d902b19ffe2ebc564cccfca34712f6f +++ /dev/null @@ -1,617 +0,0 @@ -I"úćWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
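To make the layer listing concrete without touching the full PAD-US download, here is a minimal sketch that builds a tiny geopackage in a temporary file and inspects it. The data, the layer name toy_fee, and the variable names are all made up for illustration; only st_as_sf(), st_write(), st_layers(), and st_read() with its query argument come from sf.

```r
library(sf)

# Toy stand-in for the PAD-US geopackage: three points with an ownership code.
# All names and coordinates here are hypothetical.
toy <- st_as_sf(
  data.frame(
    Unit_Nm  = c("Park A", "Park B", "Forest C"),
    Own_Type = c("STAT", "STAT", "FED"),
    lon      = c(-114.5, -124.2, -115.8),
    lat      = c(36.4, 42.0, 33.3)
  ),
  coords = c("lon", "lat"),
  crs = 4326
)

# Write it to a temporary geopackage under a made-up layer name.
gpkg_path <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg_path, layer = "toy_fee", quiet = TRUE)

# st_layers() reads only the layer metadata, not the features themselves.
layers <- st_layers(gpkg_path)
layers$name      # layer names
layers$features  # number of features in each layer

# Passing SQL through the query argument reads only the matching rows.
state_only <- st_read(
  gpkg_path,
  query = "SELECT * FROM toy_fee WHERE Own_Type = 'STAT'",
  quiet = TRUE
)
nrow(state_only)
```

If loading a whole layer is what makes R struggle, the same query idea might help with the real file, though I have not timed it against the national geopackage.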
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access" &
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- tigris::shift_geometry(preserve_area = FALSE, # shift_geometry() comes from tigris, which is not loaded above
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) I load the data using st_read()
instead of read_sf(), although read_sf() can read a geopackage layer as well.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data from the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
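One possible reason the combined call misbehaves is operator precedence: R evaluates & before |, so a chain of | conditions needs its own parentheses when it sits next to & conditions. The sketch below uses a made-up five-row table (not the real Fee layer) to show two ways of writing everything as a single filter(). I use !(x %in% y) instead of %!in% only so the example runs with dplyr alone.

```r
library(dplyr)

# Toy rows standing in for the Fee layer attribute table (hypothetical values).
toy <- tibble(
  State_Nm = c("NV", "OR", "PR", "CA", "CA"),
  Own_Type = c("STAT", "STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SREC", "SP", "SP", "MIL")
)

territories  <- c("AS", "GU", "MP", "PR", "VI")
designations <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# %in% replaces the chain of == ... | == ..., so every condition joins with &.
one_call <- toy %>%
  filter(!(State_Nm %in% territories) &
           Own_Type == "STAT" &
           Des_Tp %in% designations)

# The same idea with a | chain. The parentheses around the chain are required,
# because & is evaluated before | in R.
with_or <- toy %>%
  filter(!(State_Nm %in% territories) &
           Own_Type == "STAT" &
           (Des_Tp == "SP" | Des_Tp == "SREC"))
```

Both versions keep only the Nevada and Oregon rows in this toy table. Whether this is what went wrong in my original attempt is a guess.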
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
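For the "what are the other types?" question, count() lists every category in a column, which may save the digging later. The sketch below runs on a made-up table and also shows one thing to be aware of with !=: a row with a missing value is dropped too, because NA != "Closed" evaluates to NA rather than TRUE. I have not checked whether the real d_Pub_Acce column contains any missing values, so treat the NA row as an assumption.

```r
library(dplyr)

# Toy access column with the four documented categories plus a missing value.
toy <- tibble(
  Unit_Nm    = c("A", "B", "C", "D", "E"),
  d_Pub_Acce = c("Open Access", "Restricted Access", "Closed", "Unknown", NA)
)

# List the access categories and how many rows fall in each.
toy %>% count(d_Pub_Acce)

# Excluding with != also drops the NA row, because NA != "Closed" is NA, not TRUE.
excluded <- toy %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")

# Keeping with %in% returns the same rows here and names the kept values explicitly.
kept <- toy %>%
  filter(d_Pub_Acce %in% c("Open Access", "Restricted Access"))
```

Both filters keep rows A and B in this toy table, so the choice between them is mostly about which reads better later.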
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of dplyr, one of the tidyverse packages, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for Recreation Management Area
s, the rest I just populated the new column with the old values.
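A small sketch of the "populate the rest with the old values" idea, on a made-up three-row table: case_when() returns NA for any row that matches no condition, so copying the old value across needs an explicit fallback such as TRUE ~ d_Des_Tp. The column names type_no_default and type_with_default are hypothetical and exist only for this comparison.

```r
library(dplyr)

# Toy designation labels (hypothetical rows).
toy <- tibble(
  d_Des_Tp = c("State Park", "Recreation Management Area", "State Wilderness")
)

relabelled <- toy %>%
  mutate(
    # No fallback: rows that match no condition become NA.
    type_no_default = case_when(
      d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area"
    ),
    # TRUE ~ d_Des_Tp is the fallback that copies the old value across.
    type_with_default = case_when(
      d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
      TRUE ~ d_Des_Tp
    )
  )
```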
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
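One way to keep track of visited parks without a long case_when() might be to hold the names in a single vector and test membership with %in%. This is only a sketch on made-up rows; visited_parks is a hypothetical name, and the park names are copied from the case_when() call above.

```r
library(dplyr)

# Names of visited parks kept in one place (hypothetical vector name).
visited_parks <- c(
  "Valley of Fire State Park",
  "Crissey Field State Recreation Site",
  "Salton Sea"
)

# Toy stand-in for the state park table.
toy <- tibble(
  Unit_Nm = c("Valley of Fire State Park", "Salton Sea", "Some Other State Park")
)

# One membership test replaces one case_when() line per park.
flagged <- toy %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))
```

Adding a park then means adding one string to the vector, and the vector could just as easily be read from a text file.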
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the US Map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save it, and in part VI of this never-ending series* I’ll create individual state maps and link them to the map of National Parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
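As a rough idea of what separating by state could look like, base R's split() turns a table into a named list with one element per state code. The sketch uses a made-up three-row data frame; the commented-out loop and its output path are hypothetical and only hint at how each piece might be saved.

```r
# Toy stand-in for the processed state park table (hypothetical rows).
toy <- data.frame(
  Unit_Nm  = c("Valley of Fire State Park", "Salton Sea", "Anza-Borrego Desert State Park"),
  State_Nm = c("NV", "CA", "CA")
)

# split() returns a named list with one data frame per state code.
by_state <- split(toy, toy$State_Nm)
names(by_state)

# Each element could then be written to its own file, for example
# (hypothetical path, and it assumes the elements are sf objects):
# for (st in names(by_state)) {
#   sf::st_write(by_state[[st]], paste0("./shapefiles/shifted/states/", st, ".shp"))
# }
```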
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/c8/0cc3a134f4680bf2a260397c76eab4663c3b8059e8443254c3186567d18289 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/c8/0cc3a134f4680bf2a260397c76eab4663c3b8059e8443254c3186567d18289 deleted file mode 100644 index d613beb..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/c8/0cc3a134f4680bf2a260397c76eab4663c3b8059e8443254c3186567d18289 +++ /dev/null @@ -1,699 +0,0 @@ -I"ŮWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not require loading all of it into R. The first is to load only the geopackage layer names. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
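To see what st_layers() returns without downloading PAD-US, here is a minimal sketch that writes a tiny two-layer geopackage to a temporary file and lists its layers. The layer names parks and trails and the coordinates are made up for illustration; they are not PAD-US layers.

```r
library(sf)

# Toy data: two invented layers, not real PAD-US content
parks <- st_sf(
  name = c("A", "B"),
  geometry = st_sfc(st_point(c(-115, 36)), st_point(c(-124, 42)), crs = 4326)
)
trails <- st_sf(
  name = "T1",
  geometry = st_sfc(st_linestring(rbind(c(-115, 36), c(-116, 37))), crs = 4326)
)

# Write both layers into one temporary geopackage
gpkg <- tempfile(fileext = ".gpkg")
st_write(parks, gpkg, layer = "parks", quiet = TRUE)
st_write(trails, gpkg, layer = "trails", quiet = TRUE)

# Only the layer metadata is read here, not the features themselves
layers <- st_layers(gpkg)
layers$name      # layer names
layers$features  # number of features in each layer
```

Printing layers on its own shows the same summary table described above.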
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land). My interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
-                           TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
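As a small illustration of combining two conditions in one filter() call, the sketch below runs the same logic on an invented data frame. It uses base R's !(x %in% y), which should behave like %!in% from operator.tools; the park names and rows are made up.

```r
library(dplyr)

territories <- c("AS", "GU", "MP", "PR", "VI")

# Invented rows standing in for the PAD-US attribute table
toy <- tibble(
  Unit_Nm  = c("Park A", "Park B", "Park C", "Park D"),
  State_Nm = c("NV", "PR", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "FED", "STAT")
)

# A row is kept only when both conditions are TRUE
kept <- toy %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT")

kept$Unit_Nm
```

Park B is dropped because it is in a territory, and Park C is dropped because it is not state owned.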
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
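One possible reason the designation filter misbehaves when merged into the first filter() call is operator precedence: in R, & binds more tightly than |, so an unparenthesised mix of the two groups differently than it reads. The sketch below shows the difference on invented rows and one way around it, using %in% with the designation codes from the code above. The code XYZ is a placeholder, not a real PAD-US designation.

```r
library(dplyr)

# Designation codes taken from the filter() call above
keep_types <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Invented rows; "XYZ" is a placeholder designation code
toy <- tibble(
  Unit_Nm  = c("Park A", "Forest B", "Park C"),
  Own_Type = c("STAT", "STAT", "FED"),
  Des_Tp   = c("SP", "XYZ", "SP")
)

# Evaluated as (Own_Type == "STAT" & Des_Tp == "ACC") | Des_Tp == "SP",
# so the federally owned Park C slips through
wrong <- toy %>%
  filter(Own_Type == "STAT" & Des_Tp == "ACC" | Des_Tp == "SP")

# %in% keeps the designation test as a single condition
right <- toy %>%
  filter(Own_Type == "STAT" & Des_Tp %in% keep_types)

wrong$Unit_Nm
right$Unit_Nm
```

Wrapping the chain of | comparisons in parentheses would have the same effect as %in%; the vector version is just shorter to read and edit.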
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any column I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area
s, the rest I just populated the new column with the old values.
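Here is a minimal sketch of the general form on an invented column. It also shows one way to keep the old value for anything that is not explicitly remapped, by ending with a TRUE fallback; that fallback is my illustration and is not part of the code above.

```r
library(dplyr)

# Invented rows; the designation names come from the code above
toy <- tibble(d_Des_Tp = c("State Park", "Access Area", "State Wilderness"))

toy <- toy %>%
  mutate(type = case_when(
    d_Des_Tp == "Access Area" ~ "State Trail",
    d_Des_Tp == "State Park"  ~ "State Park or Parkway",
    TRUE                      ~ d_Des_Tp  # fallback: keep the original value
  ))

toy$type
```

Without the fallback line, any value that matches none of the conditions becomes NA in the new column.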
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
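One possible way to keep track of visited parks is to store the names in a single vector and test membership with %in%, so adding a park means adding one string rather than one case_when() line. This is a sketch of that idea, not the approach used in the code above; visited_parks is a hypothetical variable name and "Some Other State Park" is invented.

```r
library(dplyr)

# Hypothetical lookup vector; names copied from the case_when() call above
visited_parks <- c(
  "Valley of Fire State Park",
  "Crissey Field State Recreation Site",
  "Salton Sea"
)

# Invented rows standing in for state_parks
toy <- tibble(Unit_Nm = c("Valley of Fire State Park", "Some Other State Park"))

toy <- toy %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

toy$visited
```

The vector could also be read from a small text or CSV file, which keeps the list of parks out of the processing code.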
- -The logic is the same as the National Park data. mutate()
creates a new column visited
and populates it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
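To make the coordinate system change concrete, here is a small sketch that transforms one invented point from an Albers projection to WGS84. EPSG:5070 is used as a stand-in for "Albers" and is an assumption on my part; it may not be the exact projection the shifted data uses. The EPSG code 4326 is a shorthand for WGS84 longitude/latitude.

```r
library(sf)

# One made-up point in EPSG:5070 (an Albers projection, coordinates in meters)
pt_albers <- st_sf(
  name = "toy point",
  geometry = st_sfc(st_point(c(-1500000, 1800000)), crs = 5070)
)

# Reproject to WGS84 longitude/latitude, which Leaflet expects
pt_wgs84 <- st_transform(pt_albers, 4326)

st_is_longlat(pt_albers)   # FALSE: projected coordinates
st_is_longlat(pt_wgs84)    # TRUE: longitude/latitude
st_coordinates(pt_wgs84)   # now in degrees
```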
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “for each item in a collection, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
- -1
-2
-3
-4
-5
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -The split()
is part of base R. It takes quite a few arguments, most of which are optional.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks
data into its corresponding states, so it is listed first.
The second argument f =
is how you want the data split. f
in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm))
in the terminal, it will return a list of the 50 state abbreviations. That is what we’re telling R to do here.
You can access an individual state using the $
operator. split_states$CA
will return the state park data for California.
names
is also part of base R. It does what it sounds like - it gets the names of an object. Here, I want to get the name of each of the split data sets.
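A quick sketch of split() and names() on an invented data frame shows the shape of the result. The park names are made up; only base R is used.

```r
# Invented rows standing in for state_parks
toy <- data.frame(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  State_Nm = c("CA", "NV", "CA")
)

# One list element per distinct value of State_Nm
split_toy <- split(toy, f = toy$State_Nm)

names(split_toy)    # the state abbreviations
split_toy$CA        # rows for California only
split_toy[["NV"]]   # the same kind of access using [[ ]]
```

The [[ ]] form is the one that works inside a loop, because the name can come from a variable.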
Here’s the actual for loop.
- -The basic logic of a for loop is:
- for(x in y){
- do something}
Inside the parentheses, x is a placeholder and y is the collection to loop over. The content in the curly braces runs once for each element of y.
- -In line 4, for(name in all_names){
says for each name in the list of all names, do whatever is inside the curly braces. name
can be whatever you want. It’s a placeholder value. I can have it say for(dogs in all_names){
and it will still do the exact same thing. A lot of the time you’ll see it as an i
for index. I like to use more descriptive language because, again, for loops are my Achilles’ heel.
The all_names
part is wherever you want R to look for the data. It will change based on your data set and variable naming conventions.
In line 5, I save the split data sets.
- -st_write()
is part of the sf package which allows us to create shapefiles. This can be any saving function (e.g. write_csv() if you want to save CSVs). The function takes several arguments. In line 43 above I showed the basic structure: st_write(data, path/to/file.shp). This is good if you only have one file, but since I’m saving them in a loop I don’t want all of the files to have the same name. R will error out after the first and tell you the file already exists.
The first part split_states[[name]]
is still telling R what data to save, but using an index instead of a specific data frame name. To access an index you use data[[some-value]]
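Putting the pieces together, here is a self-contained sketch of the loop on invented points, writing to a temporary folder instead of the project folders. The use of file.path() and delete_dsn = TRUE is my choice for the example: the latter lets the loop be re-run without the "file already exists" error, which the original call does not do.

```r
library(sf)

# Invented points standing in for the state park polygons
toy <- st_sf(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  State_Nm = c("CA", "NV", "CA"),
  geometry = st_sfc(
    st_point(c(-120, 37)), st_point(c(-115, 36)), st_point(c(-118, 34)),
    crs = 4326
  )
)

split_toy <- split(toy, f = toy$State_Nm)

# Temporary output folder so the example leaves the project untouched
out_dir <- file.path(tempdir(), "individual")
dir.create(out_dir, showWarnings = FALSE)

for (name in names(split_toy)) {
  # Build a different file name for each state, e.g. CA.shp
  out_path <- file.path(out_dir, paste0(name, ".shp"))
  # delete_dsn = TRUE replaces an existing file instead of stopping with an error
  st_write(split_toy[[name]], out_path, delete_dsn = TRUE, quiet = TRUE)
}

list.files(out_dir, pattern = "\\.shp$")
```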
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me an email or a tweet if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not require loading all of it into R. The first is to load only the geopackage layer names. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land). My interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
-    filter(Own_Type == "STAT")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/cd/5c4213badccd372ba4332e95112b9279ad1fd8f181545ac1bbdb8b6e89edd0 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/cd/5c4213badccd372ba4332e95112b9279ad1fd8f181545ac1bbdb8b6e89edd0 deleted file mode 100644 index c87c55d..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/cd/5c4213badccd372ba4332e95112b9279ad1fd8f181545ac1bbdb8b6e89edd0 +++ /dev/null @@ -1,248 +0,0 @@ -I"CWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not require loading all of it into R. The first is to load only the geopackage layer names. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- -The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation. Having visited the park, I can tell you there are no fences blocking these areas off. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area. It will be a good test case to make sure I’m selecting the correct data.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/ce/79712b1ddac46abbccafc0439b2dee5df051ecc72c5e7c907e3a4e96fb728c b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/ce/79712b1ddac46abbccafc0439b2dee5df051ecc72c5e7c907e3a4e96fb728c new file mode 100644 index 0000000..c022024 --- /dev/null +++ b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/ce/79712b1ddac46abbccafc0439b2dee5df051ecc72c5e7c907e3a4e96fb728c @@ -0,0 +1,22 @@ +I"ń Here you'll find several non-partisan ballot guides. I try to include as much information as possible without directly recreating the official voter guide. The information is sourced from the ballot guide, calmatters, ballotpedia, LA Times, voter’s edge, and Mercury News. I have included links to the campaign website or wherever most of the information came from.
+
+ I have done my best to keep my views out of it.
+
+ I started this to help my two aunts because they would ask me to simplify their ballots for them. Democracy relies on an informed and participatory citizenry, but it’s not always easy. This is meant to alleviate some of the burden.
+
+ If you notice any errors, you feel like I’ve missed something, or you found this guide helpful feel free to send me an email [click the envelope at the bottom of the page] <br><br>
+ <h3>helpful resources:</h3>
+ <ol><li>Register to vote (Deadline May 23): <a href="https://registertovote.ca.gov/">https://registertovote.ca.gov/</a> </li>
+ <li>Check your registration status: <a href="https://voterstatus.sos.ca.gov/">https://voterstatus.sos.ca.gov/</a></li>
+ <li>Access the official voter guide: <a href="https://voterguide.sos.ca.gov/">https://voterguide.sos.ca.gov/</a> </li>
+ <li>Early voting & ballot drop off locations: <a href="https://caearlyvoting.sos.ca.gov/">https://caearlyvoting.sos.ca.gov/</a></li>
+ <li>Track your ballot: <a href="https://california.ballottrax.net/voter/">https://california.ballottrax.net/voter/</a></li>
+ <li>If you are in the Los Angeles, San Bernardino, Orange County area and need help getting to your polling place I will either find you resources or help you get there. I also offer to go with you to vote (and I will bring my two large German Shepherds) if you feel unsafe going to vote alone.</li></ol>
+
<a href="https://github.com/liz-muehlmann/Election_Guides/raw/main/California/Primary%20Elections/National%20and%20State/2022%20-%20Primary%20-%20California.pdf" download="2022_Primary_CA.pdf">2022 Primary California</a> s
+
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered them in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres)
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
rather than read_sf(), which wraps the same function and would also work.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction, but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
- library("tigris") # shift_geometry(), used when shifting the park data below
-
I am not going to explain in detail what each of these packages does because I already covered them in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
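
If reading a whole layer is what brings R down, one thing worth trying is asking sf to hand back only the rows you need. The sketch below is not run against PAD-US. It builds a tiny stand-in geopackage in a temporary file (the layer name toy_fee and every value in it are made up), lists its layers with st_layers(), and then uses the query argument of st_read() to pull only the state-owned rows. On the real file the layer would be PADUS3_0Fee and the column Own_Type. Whether this is enough to avoid the crash on the full data set is an assumption I have not tested.

```r
library(sf)

# Tiny stand-in data set (hypothetical values, not taken from PAD-US).
toy <- st_sf(
  Unit_Nm  = c("Park A", "Forest B", "Park C"),
  Own_Type = c("STAT", "FED", "STAT"),
  geometry = st_sfc(
    st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(2, 2)),
    crs = 4326
  )
)

# Write it to a temporary geopackage with a single layer called "toy_fee".
gpkg <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg, layer = "toy_fee", quiet = TRUE)

# List the layers without reading any features.
layers <- st_layers(gpkg)
layers$name

# Read only the rows whose Own_Type is "STAT" instead of the whole layer.
state_only <- st_read(
  gpkg,
  query = "SELECT * FROM toy_fee WHERE Own_Type = 'STAT'",
  quiet = TRUE
)
nrow(state_only)
```

The query string is ordinary SQL, so other conditions can be added with AND in the same WHERE clause.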
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
rather than read_sf(), which wraps the same function and would also work.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
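
For anyone who would rather not load operator.tools just for %!in%, the same "not in" test can be written in base R by negating %in%. A minimal sketch on a made-up vector standing in for the State_Nm column:

```r
# Toy stand-in for the State_Nm column (hypothetical values).
state_nm    <- c("NV", "PR", "OR", "GU", "CA")
territories <- c("AS", "GU", "MP", "PR", "VI")

# Base R spelling of "not in": wrap %in% in parentheses and negate it.
keep <- !(state_nm %in% territories)

# Only the entries that are not territories remain.
state_nm[keep]
```

Inside filter() the same expression would read !(State_Nm %in% territories).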
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction, but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
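
One likely culprit when folding the designation checks into the first filter() is operator precedence: in R, & binds more tightly than |. This is a guess at the cause, not a diagnosis of the original attempt. A minimal sketch on made-up rows (the table and its values are hypothetical) shows the difference:

```r
library(dplyr)

# Hypothetical rows using the same column names as the Fee layer.
toy <- tibble(
  State_Nm = c("NV", "PR", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SP", "SREC", "MIL")
)
territories <- c("AS", "GU", "MP", "PR", "VI")

# Without parentheses the condition is read as (A & B & C1) | C2,
# so any row with Des_Tp == "SREC" passes, even a federal one.
loose <- toy %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT" &
           Des_Tp == "SP" | Des_Tp == "SREC")

# Grouping the designations with %in% keeps all three tests joined by AND.
strict <- toy %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% c("SP", "SREC"))

nrow(loose)
nrow(strict)
```

Using %in% with a vector of designation codes also replaces the long chain of Des_Tp == ... | ... comparisons with a single test.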
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’d try to find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area
s, the rest I just populated the new column with the old values.
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/d0/a6f77e1daa791ba9d7a5532cf976e5125033df09a5d9fbd4a6590fa7bbff30 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/d0/a6f77e1daa791ba9d7a5532cf976e5125033df09a5d9fbd4a6590fa7bbff30 new file mode 100644 index 0000000..e01cc83 --- /dev/null +++ b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/d0/a6f77e1daa791ba9d7a5532cf976e5125033df09a5d9fbd4a6590fa7bbff30 @@ -0,0 +1,46 @@ +I"wHere you’ll find several non-partisan ballot guides. I try to include as much information as possible without directly recreating the official voter guide. The information is sourced from the ballot guide, calmatters, ballotpedia, LA Times, voter’s edge, and Mercury News. I have included links to the campaign website or wherever most of the information came from.
+ +I have done my best to keep my views out of it.
+ +I started this to help my two aunts because they would ask me to simplify their ballots for them. Democracy relies on an informed and participatory citizenry, but it’s not always easy. This is meant to alleviate some of the burden.
+ +If you notice any errors, you feel like I’ve missed something, or you found this guide helpful feel free to send me an email [click the envelope at the bottom of the page]
+ +Primary Elections | +General Elections | +
---|---|
2022 Primary California | +Title | +
Paragraph | +Text | +
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
-                       TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
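To make the two arguments concrete, here is a minimal sketch that does not need the PAD-US download. It writes a tiny, made-up geopackage to a temporary file (the layer name demo_fee, the park names, and the coordinates are placeholders I invented), lists its layers, and reads one layer back. The query argument at the end is an optional st_read() feature that filters rows with SQL before they reach R, which may help with a file as large as PAD-US. I have not tried it against the real data.

```r
library(sf)

# a made-up two-row layer standing in for the real PAD-US data
demo <- st_sf(
  Unit_Nm  = c("Demo State Park", "Demo Federal Land"),
  Own_Type = c("STAT", "FED"),
  geometry = st_sfc(st_point(c(-114.5, 36.4)),
                    st_point(c(-115.0, 36.0)),
                    crs = 4326)
)

# write it to a temporary geopackage under a hypothetical layer name
gpkg_path <- tempfile(fileext = ".gpkg")
st_write(demo, gpkg_path, layer = "demo_fee", quiet = TRUE)

# list the layer names without loading any features
layer_names <- st_layers(gpkg_path)$name

# argument 1 is the path, argument 2 picks the layer
whole_layer <- st_read(gpkg_path, layer = "demo_fee", quiet = TRUE)

# optional: filter rows with SQL while reading instead of afterwards
state_only <- st_read(gpkg_path,
                      query = "SELECT * FROM demo_fee WHERE Own_Type = 'STAT'",
                      quiet = TRUE)
```

With the real file, gpkg_path would be the path to the .gpkg and the layer name would be PADUS3_0Fee.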
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
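One possible reason a combined call misbehaves is operator precedence: R evaluates & before |, so a chain like A & B | C is read as (A & B) | C. The sketch below uses a small made-up data frame (toy_parks is not part of the tutorial) to show the difference, and uses %in% to collapse the long run of | comparisons. This is a guess at the cause, not a confirmed diagnosis of the original code. I also write the not-in test as !(x %in% y) so the example only needs dplyr.

```r
library(dplyr)

toy_parks <- tibble(
  State_Nm = c("NV", "PR", "CA", "OR"),
  Own_Type = c("STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SP", "SREC", "MIL")
)
territories  <- c("AS", "GU", "MP", "PR", "VI")
designations <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# without parentheses, the trailing | lets the federal SREC row slip through
loose <- toy_parks %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT" &
           Des_Tp == "SP" | Des_Tp == "SREC")

# conditions separated by commas are all joined with "and",
# and %in% replaces the chain of | comparisons
strict <- toy_parks %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% designations)
```

Here loose keeps the Nevada row and the federal California row, while strict keeps only the Nevada row.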
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’d try to find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area
s, the rest I just populated the new column with the old values.
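As a small illustration of that general form, the sketch below recodes one value and copies every other value across unchanged with a catch-all TRUE condition. The toy data frame is made up for the example. The real code above lists each designation explicitly instead of relying on a catch-all.

```r
library(dplyr)

toy <- tibble(d_Des_Tp = c("Recreation Management Area",
                           "State Park",
                           "State Wilderness"))

toy <- toy %>%
  mutate(type = case_when(
    # the one value that gets a new label
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    # every other row keeps its old value
    TRUE ~ d_Des_Tp
  ))
```

case_when() checks the conditions in order and uses the first one that is true, so the catch-all has to come last.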
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
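One possible way to make this less unwieldy, offered as a sketch rather than what the code above does, is to keep the visited park names in a single vector and test membership with %in%. The vector could just as easily be read from a one-column CSV. The toy data frame below stands in for state_parks, and "Some Other State Park" is a made-up name.

```r
library(dplyr)

# names copied from the case_when() call in the main code
visited_parks <- c("Valley of Fire State Park",
                   "Crissey Field State Recreation Site",
                   "Salton Sea")

toy_parks <- tibble(Unit_Nm = c("Valley of Fire State Park",
                                "Some Other State Park"))

# one membership test replaces one case_when() line per park
toy_parks <- toy_parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))
```

Adding a newly visited park then means adding one string to visited_parks instead of another case_when() line.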
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
- -1
-2
-3
-4
-5
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -The split()
is part of base R. It takes quite a few arguments, most of which are optional.
The first argument is the vector (or data frame) that you want to split into different groups. I want to split the state_parks
data into its corresponding states, so it is listed first.
The second argument f =
is how you want the data split. f
in this instance stands for factor. If we run levels(as.factor(state_parks$State_Nm))
in the console, it will return a list of the 50 state abbreviations. Those are the groups we’re telling R to split the data into.
You can access an individual state using the $
operator. split_states$CA
will return the state park data for California.
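Here is the same idea on a made-up data frame small enough to check by eye (toy_parks and its values are placeholders, and it is a plain data frame rather than spatial data).

```r
toy_parks <- data.frame(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  State_Nm = c("CA", "NV", "CA")
)

# one data frame per distinct value of State_Nm, returned as a named list
split_toy <- split(toy_parks, f = toy_parks$State_Nm)

names(split_toy)   # the state abbreviations, sorted alphabetically
split_toy$CA       # only the California rows
```

The result is a list with one element per state, which is what makes the loop further down possible.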
names
is also part of base R. It does what it sounds like - it gets the names of an object. Here, I want to get the names of each of the split data sets.
Here’s the actual for loop.
- -The basic logic of a for loop is:
- for(x in y){
- do something}
Inside the parentheses is what the loop steps through: the content in the curly braces runs once for each item in y, with x standing in for the current item.
- -In line 4, for(name in all_names){
says as long as there’s a name in the list of all names, do whatever is inside the curly braces. name
can be whatever you want. It’s a placeholder value. I can have it say for(dogs in all_names){
and it will still do the exact same thing. A lot of the time you’ll see it written as an i
for item. I like to use more descriptive language because, again, for loops are my Achilles’ heel.
The all_names
part is wherever you want R to look for the data. It will change based on your data set and variable naming conventions.
In line 5, I save the split data sets.
- -st_write()
is part of the sf package which allows us to create shapefiles. This can be any saving function (eg. write_csv() if you want to save CSVs). The function takes several arguments. In line 43 above I showed the basic structure: st_write(data, path/to/file.shp). This is good if you only have one file, but since I’m saving them in a loop I don’t want all of the files to have the same name. R will error out after the first and tell you the file already exists.
The first part split_states[[name]]
is still telling R what data to save, but it looks the data up inside a list instead of using a specific data frame name. To pull one element out of a list you use data[[some-value]]
where some-value
is either the element’s position or its name. In my code, R will take the split_states
list, see that on the first pass [[name]]
holds the first name in all_names (here, AK), and return the data frame stored under that name. It will then do the same for every name as it loops through the split_states
data.
paste0()
is also part of base R - it’s apparently faster than paste()
. It concatenates (or links together) different pieces into one. I’m using it to create the filename. Within the paste0
call anything within quotation marks is static. So every file will be saved to "shapefiles/shifted/states/individual/"
and every file will have the extension .shp
. What will change with each loop is the name
of the file. One by one, R will loop through and save each file using the name
it pulled from all_names
.
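To see what the loop produces without writing anything to disk, this sketch builds the file paths only. The three state abbreviations are a hand-picked stand-in for names(split_states), and out_paths is a name I made up to collect the results.

```r
all_names <- c("AK", "CA", "NV")   # stand-in for names(split_states)
out_paths <- character(0)

for (name in all_names) {
  # static folder + changing state name + static extension
  path <- paste0("shapefiles/shifted/states/individual/", name, ".shp")
  out_paths <- c(out_paths, path)
}

out_paths
```

Swapping the collecting line for st_write(split_states[[name]], path) gives the saving loop from the main code.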
st_write()
automatically creates the other three files that each “shapefile” needs. When the loop is done, you should have a folder of 200 files (50 states * 4 files each). Which is why I strongly recommend using DVC if you’re doing any kind of version control.
That’s all the processing done for the state files… for now. In part VI I’ll return to the states to create each state’s own map. Next up, in part V, I’m going back to my base map with the National Parks to add in some informational tool tips and interactivity.
- -
-* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me an email or a tweet if it is.
-** print("Hello World!")
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
-
state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
- filter(Own_Type == "STAT")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read(). read_sf() would also work, since it wraps st_read() with different defaults.
Here st_read() is given two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35, make sure you choose the one with the .gpkg extension.
The second argument, layer =, specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
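Because loading a whole layer can overwhelm R, one option worth trying is to let st_read() filter rows while reading by passing a SQL statement through its query argument. The sketch below is only an illustration under stated assumptions: it writes a tiny stand-in geopackage to a temporary file, and the layer name demo_fee and the park names are hypothetical, not PAD-US values.

```r
library(sf)

# Hypothetical stand-in for a PAD-US layer: three points with an owner code
parks <- st_sf(
  Unit_Nm  = c("Park A", "Forest B", "Park C"),
  Own_Type = c("STAT", "FED", "STAT"),
  geometry = st_sfc(
    st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(2, 2)),
    crs = 4326
  )
)

# Write the toy data to a temporary geopackage
gpkg <- tempfile(fileext = ".gpkg")
st_write(parks, gpkg, layer = "demo_fee", quiet = TRUE)

# Read back only the state-owned rows; the filtering happens during the read
state_only <- st_read(
  gpkg,
  query = "SELECT * FROM demo_fee WHERE Own_Type = 'STAT'",
  quiet = TRUE
)

state_only$Unit_Nm
```

With a query like this the unwanted rows never reach R's memory, which may help with a file as large as the national geopackage. The trade-off is that the filter lives in a SQL string rather than in a filter() call.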
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to load only the geopackage’s layer names. This doesn’t load the complete data set; it just lists the layers so you can use them in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers so I can look at them whenever I want. You don’t have to; the st_layers() function will return the list regardless of whether it’s saved.
st_layers() returns the name of each layer, what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
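To see what st_layers() gives back without downloading PAD-US, here is a minimal sketch that builds a small two-layer geopackage in a temporary file and lists its layers. The layer names demo_fee and demo_easement are hypothetical stand-ins, not the real PAD-US layer names.

```r
library(sf)

# Two toy points standing in for real park geometries
pts <- st_sf(
  name = c("a", "b"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326)
)

# Write two layers into one temporary geopackage
gpkg <- tempfile(fileext = ".gpkg")
st_write(pts, gpkg, layer = "demo_fee", quiet = TRUE)
st_write(pts[1, ], gpkg, layer = "demo_easement", quiet = TRUE)

# List the layers without reading any of the feature data
layers <- st_layers(gpkg)
layers           # printed summary: layer name, geometry type, features, fields, crs
layers$name      # just the layer names, usable as the layer argument of st_read()
```

Pulling out layers$name is handy when you want to copy a layer name exactly rather than retype it.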
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether the land is held in fee or by easement (a right to use someone else’s land). My only interest is whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -:ET
\ No newline at end of file
diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/d6/a1cb20e5bbdb0b76ecbeb575e9149cc4d888ee5a7816310179966208de5ab8 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/d6/a1cb20e5bbdb0b76ecbeb575e9149cc4d888ee5a7816310179966208de5ab8
new file mode 100644
index 0000000..6383c37
--- /dev/null
+++ b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/d6/a1cb20e5bbdb0b76ecbeb575e9149cc4d888ee5a7816310179966208de5ab8
@@ -0,0 +1,180 @@
+I"Ë3Welcome to part five of my cartography in R series. In this post I’ll return to the maps created in part II and part III to include a Shiny information box and popups linking to posts about my adventures in the National Parks.
+ + + +I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
+ + +III. cartography in r part three
+IV. cartography in r part four
+Create a new file called app.r, which we’ll use to build in the Shiny functionality. Keep in mind Shiny requires the filename to be app.
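For orientation, a bare-bones app.r could look roughly like the sketch below. It is a placeholder under stated assumptions, not the finished app: the output ids map and info are hypothetical, and the map is an empty tile layer centered on the continental US rather than the shifted shapefiles used in this series.

```r
library(shiny)
library(leaflet)

# UI: a map plus a small text area (the ids "map" and "info" are hypothetical)
ui <- fluidPage(
  leafletOutput("map", height = 500),
  textOutput("info")
)

# Server: fills the two outputs declared in the UI
server <- function(input, output, session) {
  output$map <- renderLeaflet({
    leaflet() %>%
      addTiles() %>%                              # placeholder basemap
      setView(lng = -98.5, lat = 39.8, zoom = 4)  # roughly the center of the lower 48
  })
  output$info <- renderText("Park details will appear here.")
}

# Combine the two pieces into an app object; runApp(app) would launch it
app <- shinyApp(ui = ui, server = server)
```

Saving the result of shinyApp() to a variable keeps the sketch from launching a browser the moment the file is sourced.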
1
+2
+3
+4
+5
+6
+
library(tidyverse) # useful data manipulation tools
+ library(sf) # read and write shapefiles
+ library(tigris) # downloading shapefiles for Method 1
+ library(leaflet) # map creation
+ library(operator.tools) # not-in function
+ library(shiny) # interactivity
+
I am not going to explain in detail what the packages in lines 1-5 do because I already covered it in part one.
+ +In a previous map I made I used labels to create a pop up which contained information about the number of newspapers in each county. In that map, I was only interested in showing the population and number of newspapers.
+ + + +In this map I want to display more information, but because National and State Parks are close together, the pop ups overlapped and quickly became unreadable. I want to move the basic information about each park’s name and size to a box in the corner and use the pop ups only to display a small photo of the park that links to a blog post about my adventure there.
+ +1
+2
+3
+
## load data
+ usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
+ nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
+
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
The customization aspects of the map - using special colors, adding in Shiny functionality, etc - are all declared before the map widget call. That creates some difficulty in how to best present the code.
+ +First, I’ll give all the code then cover what’s different in each section. If you’re using the code we created in part III be mindful of where the new lines appear so you can avoid any errors.
+ +I like to use different colors on my maps to indicate different things. In the map above, the warmer colors indicate fewer newspapers and the cooler colors indicate more newspapers, with green marking the areas with the most newspapers.
+ +In this map, I want to use colors that reflect the land’s designation types. I want the rivers to show up blue, the parks to be green, and other areas to be colored as I dictate.
+ +1
+2
+3
+4
+5
+6
+7
+8
+
nps_color <- colorFactor(c("#B2AC88", # national historical
+ "#F9F2CB", # international historical
+ "#99941A", # military
+ "#006C5F", # park
+ "#568762", # preserves and rec areas
+ "#31B8E6", # lakes and rivers
+ "#899B7C", # trails
+ "#AFAC99"), nps$type) # other
+
Line 1 creates a variable which I’ll use later in the leaflet call. It makes the leaflet call cleaner and less cluttered, though you could also declare the colors directly in the addPolygons() call.
colorFactor() is part of the Leaflet package. It assigns colors to factors (categories) - here the factors are the types of National Public Lands. It takes several arguments, which are described in the leaflet package documentation.
The first argument is the palette. This can be the name of one of the RColorBrewer palettes or, as here (the hex codes in lines 1-8), a vector of colors that the user specifies.
+ +nps$type is the domain of the data. It’s the set of categories that R will map the colors to. colorFactor() requires categorical data.
If you’re creating your own palette, the order you list the colors has to match the order of the domain. To easily check what order the domain is in, run levels(as.factor(nps$type)) in the R console. This returns the category names in the order the colors will be assigned.
I include a comment of which colors will be mapped to which category so that I can easily change them if necessary.
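The ordering rule is easier to see on a small example. The sketch below uses three hypothetical categories rather than the real nps$type values; the hex codes are borrowed from the palette above.

```r
library(leaflet)

# Hypothetical categories, deliberately out of alphabetical order
park_type <- c("trail", "lake", "park", "lake")

# The order colorFactor() will use for a character domain
levels(as.factor(park_type))  # "lake" "park" "trail"

# List the colors in that same order
demo_color <- colorFactor(
  c("#31B8E6",   # lake
    "#006C5F",   # park
    "#899B7C"),  # trail
  domain = park_type
)

# The result is a function: pass it categories, get back one color per category
cols <- demo_color(c("lake", "park", "trail"))
cols
```

If a category is added later, the alphabetical order can shift, so it is worth rerunning the levels() check before trusting the comments next to each hex code.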
+ + + +Shiny apps have three components:
+ +a user interface object (ui) which controls the “layout and appearance” of the app.
+a server function, which contains the instructions needed to build the app.
+the shinyApp function, which creates the Shiny objects using the first two components.
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to load only the geopackage’s layer names. This doesn’t load the complete data set; it just lists the layers so you can use them in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers so I can look at them whenever I want. You don’t have to; the st_layers() function will return the list regardless of whether it’s saved.
st_layers() returns the name of each layer, what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether the land is held in fee or by easement (a right to use someone else’s land). My only interest is whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read(). read_sf() would also work, since it wraps st_read() with different defaults.
Here st_read() is given two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35, make sure you choose the one with the .gpkg extension.
The second argument, layer =, specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
The Tidyverse’s filter() function allows you to combine conditions within one filter call by using or (|) and and (&).
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
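The filter logic can be tried on a toy table before running it on the full layer. The sketch below is an illustration under stated assumptions: the five rows are made up, !(x %in% y) stands in for the %!in% operator from operator.tools so the example only needs dplyr, and the chain of | tests on Des_Tp is written as a single %in% against the same nine codes used in the code above.

```r
library(dplyr)

territories <- c("AS", "GU", "MP", "PR", "VI")
keep_types  <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Made-up rows that mimic the columns used in the real filter
toy <- tibble(
  Unit_Nm  = c("Park A", "Park B", "Forest C", "Park D", "Base E"),
  State_Nm = c("NV", "PR", "OR", "CA", "CA"),
  Own_Type = c("STAT", "STAT", "FED", "STAT", "FED"),
  Des_Tp   = c("SP", "SP", "NF", "SREC", "MIL")
)

state_rows <- toy %>%
  # both conditions must be TRUE: not a territory AND state owned
  filter(!(State_Nm %in% territories) & Own_Type == "STAT") %>%
  # keep only the designation types of interest
  filter(Des_Tp %in% keep_types)

state_rows$Unit_Nm
```

Only Park A and Park D survive: Park B sits in a territory, and the other two rows are federally owned. Keeping the codes in a vector like keep_types also makes it easier to add or drop a designation later.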
- -Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to explore the data without loading all of it into R. The first is to load only the geopackage’s layer names. This doesn’t load the complete data set; it just lists the layers so you can use them in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers so I can look at them whenever I want. You don’t have to; the st_layers() function will return the list regardless of whether it’s saved.
st_layers() returns the name of each layer, what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- -The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management and the Bureau of Reclamation. Having visited the park, I can tell you there are no fences blocking these areas off. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area. It will be a good test case to make sure I’m selecting the correct data.
- - - -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/da/3507f7fe69d048357c9cffcb9f966b5dc4e0c7f2e30eafc8361ceaa3f1c1f6 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/da/3507f7fe69d048357c9cffcb9f966b5dc4e0c7f2e30eafc8361ceaa3f1c1f6 deleted file mode 100644 index e060b5c..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/da/3507f7fe69d048357c9cffcb9f966b5dc4e0c7f2e30eafc8361ceaa3f1c1f6 +++ /dev/null @@ -1,303 +0,0 @@ -I"©RWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
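-
territories <- c("AS", "GU", "MP", "PR", "VI") # US territories to exclude
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
- filter(Own_Type == "STAT") # Own_Type holds the code; "State" is its description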
-
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
- library("tigris") # shift_geometry(), used in the processing code below
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to; the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
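To see what st_layers() hands back without touching the full PAD-US download, here is a minimal sketch that builds a tiny geopackage in a temporary file and lists its layers. The layer name demo_points and the two points are made up for illustration; only the st_layers() call mirrors the code above.

```r
library(sf)

# Build a tiny geopackage in a temporary file (illustrative data only)
pts <- st_sf(
  name = c("a", "b"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326)
)
gpkg <- tempfile(fileext = ".gpkg")
st_write(pts, gpkg, layer = "demo_points", quiet = TRUE)

# st_layers() reads the layer metadata, not the features themselves
layers <- st_layers(gpkg)

layers$name      # layer names
layers$geomtype  # geometry type of each layer
layers$features  # number of features in each layer
```

Printing layers shows the same summary table described above; the individual elements are handy when you only need the names.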
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access" &
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
here, although read_sf()
is a wrapper around it and can read geopackages too. In this call, st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35, make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
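A minimal sketch of that combined condition on toy rows (not real PAD-US records). To keep it dependent on dplyr alone, it defines its own %!in% as a stand-in for the operator.tools version; the park names are invented.

```r
library(dplyr)

# Stand-in for operator.tools' %!in%, defined here so the sketch needs only dplyr
`%!in%` <- function(x, table) !(x %in% table)

territories <- c("AS", "GU", "MP", "PR", "VI")

# Toy rows for illustration
parks <- tibble(
  Unit_Nm  = c("Park A", "Park B", "Park C", "Park D"),
  State_Nm = c("NV", "PR", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "FED", "STAT")
)

# A row is kept only when BOTH conditions are TRUE
kept <- parks %>%
  filter(State_Nm %!in% territories & Own_Type == "STAT")

kept$Unit_Nm
```

Park B drops out because it is in a territory, and Park C drops out because it is not state owned.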
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
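One possible reason the single combined filter() misbehaved (an assumption, since the failing code isn't shown): in R, & binds more tightly than |, so mixing them without parentheses changes the meaning. The sketch below uses toy rows to show the difference, and uses %in% as one way to express the designation list as a single condition.

```r
library(dplyr)

# Toy rows for illustration
parks <- tibble(
  Unit_Nm  = c("Park A", "Park B", "Park C", "Park D"),
  Own_Type = c("STAT", "FED", "STAT", "FED"),
  Des_Tp   = c("SP", "SP", "MIL", "SREC")
)

# Read by R as (Own_Type == "STAT" & Des_Tp == "SP") | Des_Tp == "SREC",
# so the federal Park D slips through
loose <- parks %>%
  filter(Own_Type == "STAT" & Des_Tp == "SP" | Des_Tp == "SREC")

# %in% keeps the whole designation test as one condition
wanted <- c("SP", "SREC")
strict <- parks %>%
  filter(Own_Type == "STAT" & Des_Tp %in% wanted)

loose$Unit_Nm
strict$Unit_Nm
```

Wrapping the chain of | comparisons in parentheses would have the same effect as %in% here.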
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area"
s; for the rest, I just populated the new column with the old values.
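A small sketch of how case_when() treats values that match none of the conditions, using toy designations. "Some Other Designation" is a made-up value, and the TRUE ~ fallback is shown as one option rather than what the code above does.

```r
library(dplyr)

# Toy designations for illustration
designations <- tibble(
  d_Des_Tp = c("State Park", "State Wilderness", "Some Other Designation")
)

recoded <- designations %>%
  mutate(
    # No fallback: values that match no condition become NA
    type = case_when(
      d_Des_Tp == "State Park" ~ "State Park or Parkway",
      d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area"
    ),
    # TRUE ~ ... acts as a catch-all; here it carries the old value over
    type_with_fallback = case_when(
      d_Des_Tp == "State Park" ~ "State Park or Parkway",
      d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
      TRUE ~ d_Des_Tp
    )
  )

recoded
```

Without a fallback, an unexpected designation silently turns into NA, so it is worth checking the new column for missing values after recoding.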
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
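One possible way to keep the visited list manageable, sketched on toy rows: store the visited park names in a vector and mark rows with %in%. The visited_parks vector and "Some Other State Park" are illustrative; this is an alternative to the case_when() approach above, not a replacement the series uses.

```r
library(dplyr)

# Names of parks already visited; extend this vector as the list grows
visited_parks <- c("Valley of Fire State Park", "Anza-Borrego Desert State Park")

# Toy rows for illustration
parks <- tibble(
  Unit_Nm = c("Valley of Fire State Park",
              "Some Other State Park",
              "Anza-Borrego Desert State Park")
)

parks <- parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

parks
```

The vector could also be read from a small text or CSV file, so adding a park would not mean editing the pipeline.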
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
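A minimal sketch of that reprojection step on a single made-up point. It assumes EPSG:5070 as a stand-in Albers projection and uses the EPSG code 4326 for WGS84 instead of the proj string in the code above; the coordinates are arbitrary.

```r
library(sf)

# One illustrative point in an Albers projection (EPSG:5070, units are meters)
albers_pt <- st_sf(
  name = "demo",
  geometry = st_sfc(st_point(c(-1500000, 1800000)), crs = 5070)
)

# Reproject to WGS84 longitude/latitude, which Leaflet expects
wgs84_pt <- st_transform(albers_pt, 4326)

st_is_longlat(albers_pt)    # FALSE: projected coordinates
st_is_longlat(wgs84_pt)     # TRUE: longitude/latitude
st_coordinates(wgs84_pt)    # the same location, now in degrees
```

st_is_longlat() is a quick check to run before handing data to Leaflet.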
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
-
split_states <- split(state_parks, f = state_parks$State_Nm)
- all_names <- names(split_states)
-
- for(name in all_names){
- st_write(split_states[[name]], paste0("shapefiles/shifted/states/individual/", name, '.shp'))}
-
Look ma, new code!
- -The split()
function is part of base R.
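To see what split() and the loop do without the full data set, here is a small sketch on a plain data frame. It writes CSV files to a temporary folder instead of shapefiles, and the park names are invented.

```r
# Toy rows for illustration
parks <- data.frame(
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  State_Nm = c("NV", "CA", "NV"),
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per state
split_states <- split(parks, f = parks$State_Nm)
all_names <- names(split_states)

# Write each state's rows to its own file in a temporary folder
out_dir <- tempdir()
for (name in all_names) {
  write.csv(split_states[[name]],
            file.path(out_dir, paste0(name, ".csv")),
            row.names = FALSE)
}

all_names
list.files(out_dir, pattern = "\\.csv$")
```

The names of the list come from the values in State_Nm, which is why the loop can reuse them as file names.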
* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me a line if it is.
-** print("Hello World!")
This is part three of my cartography in R series. If you are just finding this, I suggest taking a look at part I and part II first.
- -In this post, I will download and process the National Park data. Once that’s done, I’ll add it to the base map I created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- states <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
to reflect wherever you saved the shifted shapefile.
If your data processing and base map creation are in the same file, you can skip this line, and when you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -The National Park Service provides all the data we’ll need to make the map. The data is accessible on the ArcGIS’ Open Data website. Once you click on the link you’ll see a bunch of icons that lead to different data that’s available for download. Click on the one for boundaries.
- - - -From here, you’ll be taken to a list of available National Park data. The second link should be nps boundary which contains the shape data for all the National Parks in the United States. The file contains all the data for the park outlines along with hiking trails, rest areas, and lots of other data.
- - - -The nps boundary link will take you to a map showing the national parks. There will be a download link on the left.
- - - -From here, you’ll have a few download options. The National Park Service provides the data in different formats including CSV and Shapefile. You’ll want to download the shapefile version.
- - - -Be sure to save the file somewhere on your hard drive that is easy to find. When it finishes downloading, be sure to unzip the file. There will be four files inside the folder. All of them need to be kept in the same location. Even though we’ll only load the .shp
file, R uses the three others to create the necessary shapes.
The code below may look intimidating, but it’s fairly straightforward. I’ll go over each line below.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- ## load and process nps data
- nps <- read_sf("./shapefiles/original/nps/NPS_-_Land_Resources_Division_Boundary_and_Tract_Data_Service.shp") %>%
- select(STATE, UNIT_TYPE, PARKNAME, Shape__Are, geometry) %>%
- filter(STATE %!in% territories) %>%
- mutate(type = case_when(UNIT_TYPE == "International Historic Site" ~ "International Historic Site", # there's 23 types of national park, I wanted to reduce this number.
- UNIT_TYPE == "National Battlefield Site" ~ "National Military or Battlefield", # lines 56-77 reduce the number of park types
- UNIT_TYPE == "National Military Park" ~ "National Military or Battlefield",
- UNIT_TYPE == "National Battlefield" ~ "National Military or Battlefield",
- UNIT_TYPE == "National Historical Park" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Historic Site" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Historic Trail" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Memorial" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Monument" ~ "National Historical Park, Site, Monument, or Memorial",
- UNIT_TYPE == "National Preserve" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National Reserve" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National Recreation Area" ~ "National Preserve, Reserve, or Recreation Area",
- UNIT_TYPE == "National River" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Lakeshore" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Wild & Scenic River" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Seashore" ~ "National River, Lakeshore, or Seashore",
- UNIT_TYPE == "National Trails Syste" ~ "National Trail",
- UNIT_TYPE == "National Scenic Trail" ~ "National Trail",
- UNIT_TYPE == "National Park" ~ "National Park or Parkway",
- UNIT_TYPE == "Park" ~ "National Park or Parkway",
- UNIT_TYPE == "Parkway" ~ "National Park or Parkway",
- UNIT_TYPE == "Other Designation" ~ "Other National Land Area")) %>%
- mutate(visited = case_when(PARKNAME == "Joshua Tree" ~ "visited",
- PARKNAME == "Redwood" ~ "visited",
- PARKNAME == "Santa Monica Mountains" ~ "visited",
- PARKNAME == "Sequoia" ~ "visited",
- PARKNAME == "Kings Canyon" ~ "visited",
- PARKNAME == "Lewis and Clark" ~ "visited",
- PARKNAME == "Mount Rainier" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save the shifted national park data
- st_write(nps, "~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
In part I of this series I talked about how R has an %in%
function, but not a %!in%
function. Here’s where the latter function shines.
The United States is still an empire with its associated territories and islands. In this project I am interested in the 50 states - without these other areas. As a result, I need to filter them out. Using base R’s %in%
function I would have to create a variable that contains the postal abbreviations for all 50 states. That is annoying. Instead, I want to use the shorter list that only includes the US’ associated islands and territories. To do so, however, I need to use the operator tools’ %!in%
function.
Line 2 creates the list of US territories that I filter out in line 7. The c()
function in R means combine or concatenate. Inside the parentheses are the five postal codes for American Samoa, Guam, the Northern Mariana Islands, Puerto Rico, and the U.S. Virgin Islands.
nps <- read_sf("path/to/file.shp")
loads the National Park data set to a variable called nps
using the read_sf()
function that is part of the sf package. You will need to change the file path so it reflects where you saved the data on your hard drive.
The %>%
operator is part of the tidyverse package. It takes the result of the code on its left and passes it as the first argument to the function on the next line. It has to go at the end of a line, rather than the beginning.
select
is part of the tidyverse package. With it, we can select columns by their name rather than their associated number. Large data sets take more computing power because the computer has to iterate over more rows. Unfortunately, rendering maps also takes a lot of computing power so I like to discard any unnecessary columns to reduce the amount of effort my computer has to exert.
Deciding on which columns to keep will depend on the data you’re using and what you want to map (or analyze). I know for my project I want to include a few things:
-There’s a couple ways to inspect the data to see what kind of information is available.
- -view(nps)
but as the number of data points increases, so does R’s struggle with opening it. I’ve found that VSCode doesn’t throw as big of a fit as R Studio when opening large data sets.data.frame(colnames(nps))
. This will return a list of the data set’s column names. This is my preferred method. I then go to the documentation to see what each column contains. This isn’t fool-proof because it really depends on if the data has good documentation.The National Park data includes a lot of information about who created the data and maintains the property. I’m not interested in this, so in line 6 I select the following columns:
-The geometry column is specific to shapefiles and it includes the coordinates of the shape. It will be kept automatically - unless you use the st_drop_geometry()
function. I like to specifically select so I remember it’s there.
In line 7 I use the territories list I created in line 2 to filter out the United States’ associated areas. Since the nps data uses the two character state abbreviation, I have to use the two character abbreviation for the territories. Searching for “Guam,” for example, won’t work.
- -filter()
is part of the tidyverse and it uses conditional language. In the parentheses is a condition that must be true if the tidyverse is going to keep the row. Starting at the top of the data, R goes “alright, does the value in the STATE column match any of the values in the territories list?” If the condition is TRUE, R adds the row to the new data frame.
%!in%
operator, any row that evaluates as TRUE will be kept because the value is NOT found in the territories list. If I wanted to keep only the territories, I would use the %in%
operator and only the rows with STATE abbreviations found in the territories list would be kept. For example, if the STATE value in row 1 is CA, filter looks at it and goes “is CA NOT IN territories?” If that is TRUE, keep it because we want only the values that are NOT IN the territories list.
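If you'd rather not load operator.tools, the same "not in" filter can be sketched in base R by negating %in%. The tiny `parks` data frame below is made up for illustration; it only stands in for the STATE column of the real NPS data.

```r
# Hypothetical mini data set standing in for the NPS attribute table
parks <- data.frame(
  PARKNAME = c("Joshua Tree", "War in the Pacific", "Mount Rainier", "San Juan"),
  STATE    = c("CA", "GU", "WA", "PR"),
  stringsAsFactors = FALSE
)

territories <- c("AS", "GU", "MP", "PR", "VI")

# %in% is TRUE when STATE is found in the list; the leading ! flips it,
# so is_state is TRUE for rows that are NOT in the territories list
is_state <- !(parks$STATE %in% territories)

states_only      <- parks[is_state, ]   # what the %!in% filter keeps
territories_only <- parks[!is_state, ]  # what a plain %in% filter would keep

print(states_only$PARKNAME)
```

The logic is the same as the %!in% version: a row is kept only when the test comes back TRUE.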
- -mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
The NPS data set has 23 different types of National Parks listed (you can view all of them by running levels(as.factor(nps$UNIT_TYPE))
). I know that in later posts, I’m going to color code the land by type (blue for rivers, green for national parks, etc) so I wanted to reduce the number of colors I would have to use.
mutate()
’s first argument, type =
creates a new column called type
. R will populate the newly created column with whatever comes after the first (singular) equal =
sign. For example, I can put type = NA
and every row in the column will say NA
.
Here, I am using the case_when()
function, which is also part of the tidyverse. The logic of case_when
is fairly straightforward. The first value is the name of the column you want R to look in (here: UNIT_TYPE
). Next, is a conditional. Here I am looking for an exact match (==
) to the string (words) inside the first set of quotation marks (in line 8: "International Historic Site"
). The last part of the argument is what I want R to put in the type
column when it finds a row where the UNIT_TYPE
is "International Historic Site"
.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
Lines 9-29 do the same thing for the other park types. You can reduce the parks however you want or use all 23 types. Just remember that the value before the tilde ~
has to match the values found in the data exactly. For example, in line 24 I change the NPS data’s National Trails Syste value to be National Trail. Whoever created the data set did not spell system correctly, so for R to match the value I also have to omit the last letter in system.
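Here is a small, self-contained sketch of the same case_when() pattern. It assumes only the dplyr package; the five-row data frame is invented for the example. It groups several values in one condition with %in% and adds a TRUE fallback, because case_when() returns NA for any row that matches none of the conditions.

```r
library(dplyr)

# Hypothetical stand-in for the UNIT_TYPE column of the NPS data
parks <- data.frame(
  UNIT_TYPE = c("National Park", "Parkway", "National Trails Syste",
                "National Seashore", "Something Else"),
  stringsAsFactors = FALSE
)

parks <- parks %>%
  mutate(type = case_when(
    # several original values can share one replacement
    UNIT_TYPE %in% c("National Park", "Park", "Parkway") ~ "National Park or Parkway",
    UNIT_TYPE %in% c("National Trails Syste", "National Scenic Trail") ~ "National Trail",
    UNIT_TYPE == "National Seashore" ~ "National River, Lakeshore, or Seashore",
    # fallback for anything not matched above (otherwise the result is NA)
    TRUE ~ "Other National Land Area"
  ))

print(parks)
```

Whether you write one line per value or group them with %in% is a matter of taste; the grouped form is just shorter to maintain.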
Lines 30-37 use the same mutate()
and case_when
logic as above. Instead of reducing the number of park types, I use it to mark the different parks I have visited.
Line 30 creates the new column, visited
and uses case_when
to look for the names of the parks that I’ve been to. If I have visited them, it adds visited
to the column of the same name.
The last line, TRUE ~ "not visited"))
, acts as an else statement. For any park not listed above, it will put not visited
in the visited
column I created.
This feels like a very brute-force method of tracking which parks I’ve visited, but I haven’t spent much time trying to find another way.
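One possible alternative, offered as a sketch rather than the method used in this project, is to keep the visited park names in a single vector and test membership with %in%. The `parks` data frame here is a made-up stand-in for the PARKNAME column.

```r
# Parks marked as visited in the code above, stored once in a vector
visited_parks <- c("Joshua Tree", "Redwood", "Santa Monica Mountains",
                   "Sequoia", "Kings Canyon", "Lewis and Clark", "Mount Rainier")

# Hypothetical stand-in for the NPS attribute table
parks <- data.frame(
  PARKNAME = c("Joshua Tree", "Yosemite", "Mount Rainier"),
  stringsAsFactors = FALSE
)

# ifelse() checks every row: in the vector -> "visited", otherwise "not visited"
parks$visited <- ifelse(parks$PARKNAME %in% visited_parks, "visited", "not visited")

print(parks)
```

Adding a newly visited park then means adding one name to the vector instead of another case_when() line.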
- -In part II, when I made the base map, I moved Alaska and Hawaii so they were of similar size and closer to the continental USA. For the map to display the parks correctly, I have to shift them as well.
- -I went over these two lines in part II, so I won’t go over them again here. If you want to read more about them, check out that post.
- -The last line uses the st_transform()
function from the sf package to convert the data set from NAD83 to WGS84. Leaflet requires WGS84, so be sure to include this line at the end of your data manipulation.
I covered the WGS84 ellipsoid in part I, if you want to read more about it.
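To see the conversion in isolation, here is a minimal sketch using a single made-up point in southern Nevada. It assumes EPSG code 4269 for NAD83 and EPSG code 4326 for WGS84 longitude/latitude, which is a shorthand for the kind of proj string used in the code above.

```r
library(sf)

# One arbitrary point, declared as NAD83 (EPSG:4269)
pt_nad83 <- st_sfc(st_point(c(-114.51, 36.49)), crs = 4269)

# Convert to WGS84 (EPSG:4326), the system Leaflet expects
pt_wgs84 <- st_transform(pt_nad83, 4326)

# Check the coordinate reference system before and after
print(st_crs(pt_nad83)$input)
print(st_crs(pt_wgs84)$input)
```

Running st_crs() on your own data before mapping is a quick way to confirm the transformation actually happened.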
- -Strictly speaking, this line isn’t necessary. You can do all your data processing in the same file where you make your map, but I prefer to separate the steps into different files.
- -As a result, I save the shifted data to my hard drive so it’s easier to load later. I usually have this line commented out (by placing #
at the start of the line) after I save it the first time. I don’t want it to save every time I run the rest of the code.
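Instead of commenting the line in and out, one option is to write the file only when it doesn't already exist. This is a sketch with a made-up one-point data set and a temporary path; swap in your own object and file path.

```r
library(sf)

# Hypothetical output location; replace with your own path
out_path <- file.path(tempdir(), "nps_example.shp")

# Tiny stand-in for the processed nps data
example <- st_sf(
  name     = "example park",
  geometry = st_sfc(st_point(c(-114.51, 36.49)), crs = 4326)
)

# Only save when the file is not already on disk
if (!file.exists(out_path)) {
  st_write(example, out_path, quiet = TRUE)
}
```

With this guard, re-running the whole script leaves an existing file untouched.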
1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-
## create usa Base Map using leaflet()
- map <- leaflet() %>%
- addPolygons(data = states,
- smoothFactor = 0.2,
- fillColor = "#808080",
- fillOpacity = 0.5,
- stroke = TRUE,
- weight = 0.5,
- opacity = 0.5,
- color = "#808080",
- highlight = highlightOptions(
- weight = 0.5,
- color = "#000000",
- fillOpacity = 0.7,
- bringToFront = FALSE),
- group = "Base Map") %>%
- addPolygons(data = nps,
- smoothFactor = 0.2,
- fillColor = "#354f52",
- fillOpacity = 1,
- stroke = TRUE,
- weight = 1,
- opacity = 0.5,
- color = "#354f52",
- highlight = highlightOptions(
- weight = 3,
- color = "#fff",
- fillOpacity = 0.8,
- bringToFront = TRUE),
- group = "National Parks") %>%
- addLayersControl(
- baseGroups = "Base Map",
- overlayGroups = "National Parks",
- options = layersControlOptions(collapsed = FALSE))
-
Lines 2-16 are identical to those in part II where I created the base map. I am not going to cover these sections in detail, because I covered it previously.
- -To add the National Park data to the base map, we call addPolygons()
again. The arguments are the same as before - color, opacity, outline style - just with different values. By changing those values, we can differentiate the base map from the national park data.
Since we’re mapping the National Parks and not the states, we have to tell R where the data is located using data = nps
.
smoothFactor
determines how detailed the park boundaries should be. The lower the number, the more detailed the shape. The higher the number, the smoother the parks will render. I usually match this to whatever I set for the base map for consistency.
Define the color and transparency of the National Parks. In a future post, I am going to change the color of each type of public land, but for now, I’ll make them all a nice sage green color #354f52
. I also want the parks to be fully opaque.
The next four lines (21-24) define what kind of outline the National Parks will have. I detail each of these arguments in part II of this series.
- -Briefly, I want there to be an outline to each park (stroke = TRUE
) that’s thicker (weight = 1
) than the outline used on the base map. I do not like the way it looks at full opacity, so I make it half-transparent (opacity = 0.5
). Finally, I want the outline color = "#354f52"
to be the same color as the fill. This will matter more when I change the fill color of the parks later on.
Lines 25-28 define the National Park’s behavior on mouseover. First we have to define and initialize the highlightOptions()
function. The function takes arguments similar to those of the addPolygons
function - both of which I go over in detail in part II.
I want to keep the mouseover behavior noticeable, but simple. To do so, I set the outline’s thickness to be weight = 3
. This will give the shape a nice border that differentiates it from the rest of the map.
color = "#fff"
sets the outline’s color on mouseover only. So, when inactive, the outline color will match the fill color, but on mouseover the outline color switches to white (#fff
).
bringToFront
can either be TRUE
or FALSE
. If TRUE
, Leaflet will bring the park to the forefront on mouseover. This is useful later when we add in the state parks because national and state parks tend to be close together.
When FALSE
the shape will remain static.
Since Leaflet adds all new data to the top of the base map, I think it’s useful to group the layers together. In the next block of code, we add in some layer functionality. For now, though, I want to add the National Parks to their own group so I can hide the National Parks if I want.
- -addLayersControl
defines how layers are displayed on the final map. The function takes three arguments.
First, we have to tell Leaflet which layer should be used as the base map: baseGroups = "Base Map"
. The name in the quotations (here: "Base Map"
) has to match the name given to the layer you set in the addPolygons()
call. In line 14, I put the 50 states into a group called "Base Map"
, but you can name it anything you like.
There can be more than one base map, too. It’s not super helpful here since I shifted Alaska and Hawaii, but when using map tiles you can add multiple types of base maps that users can switch between.
- -Next, we have to define the layers that are shown on top of the base group: overlayGroups = "National Parks"
. Just like the base map, this is defined in the corresponding addPolygons
call. Here, I called the layer National Parks
in line 30.
Finally, on the map I don’t want the layers to be collapsed, so I set options = layersControlOptions(collapsed = FALSE)
. When TRUE
the map will display an icon in the top right that, when clicked, will show the available layers.
Hey, look at that! You made a base map and you added some National Park data to it. You’re a certified cartographer now!
- -In part IV, we’ll download and process the state park data before adding it to the map. In part V of this series, we’ll add Shiny functionality and some additional markers.
- - -</figure>
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/dc/2a65920806ec1e45e9b5614492fea639f7bde969e20899ba9b31c6121b7f84 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/dc/2a65920806ec1e45e9b5614492fea639f7bde969e20899ba9b31c6121b7f84 deleted file mode 100644 index 9ceb33b..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/dc/2a65920806ec1e45e9b5614492fea639f7bde969e20899ba9b31c6121b7f84 +++ /dev/null @@ -1,355 +0,0 @@ -I"«eWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered them in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
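If you want to try st_layers() before downloading PAD-US, the sf package ships with a small geopackage you can point it at. This sketch uses that bundled file, not the PAD-US data, so the layer names and counts will differ from what you see later.

```r
library(sf)

# Small example geopackage installed with sf (not the PAD-US file)
gpkg_path <- system.file("gpkg/nc.gpkg", package = "sf")

# Lists the layers without reading the features themselves
layers <- st_layers(gpkg_path)
print(layers)

# The layer names are what you pass to st_read(..., layer = )
print(layers$name)
```

The names stored in `layers$name` are the values the `layer =` argument of st_read() expects.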
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
- filter(Own_Type == "State")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data from the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is Stat. For the row to be selected, both conditions must evaluate to true.
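As a sketch of combining both conditions in a single filter() call, here is a version with & on a made-up table. The column names mirror PAD-US, but the rows and the Own_Type values are invented for illustration, so check the real values in your own data before filtering on them.

```r
library(dplyr)

# Hypothetical rows; the real PAD-US values may be coded differently
lands <- data.frame(
  Unit_Nm  = c("Valley of Fire State Park", "Example Territory Park",
               "Example Federal Land", "Crissey Field State Recreation Site"),
  State_Nm = c("NV", "GU", "CA", "OR"),
  Own_Type = c("State", "State", "Federal", "State"),
  stringsAsFactors = FALSE
)

territories <- c("AS", "GU", "MP", "PR", "VI")

# A row is kept only when BOTH conditions are TRUE
state_lands <- lands %>%
  filter(!(State_Nm %in% territories) & Own_Type == "State")

print(state_lands$Unit_Nm)
```

Two separate filter() calls chained with %>% give the same rows; the single call just makes the "and" explicit.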
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After these the two conditions in this line the data set has
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/dd/c89e4885b12cef33557c5d7a6706d9c1be85d9863eba423643654c41604804 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/dd/c89e4885b12cef33557c5d7a6706d9c1be85d9863eba423643654c41604804 deleted file mode 100644 index 7d3d70e..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/dd/c89e4885b12cef33557c5d7a6706d9c1be85d9863eba423643654c41604804 +++ /dev/null @@ -1,7 +0,0 @@ -I"` -1
-2
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
It made a difference to that one. + ++ +
I have a simple motto in life: Do what you can, where you are, with what you have. As a result, I believe strongly in doing whatever is in my means to make the world a better place. Below are some groups that I have either founded or joined in order to help those around me.</p>
+ +Political Science Methodology Group | Co-organizer with Melina Much
+University of California, Irvine
Political Science Womxn’s Caucus | Student leader
+University of California, Irvine
Political Science Workshop Coordinator
+University of California, Irvine
Legal Politics Writing Workshop
+University of California, Irvine
Center for Democracy: Writing Workshop | Member
+University of California, Irvine
UCI Humanities: Writing Workshop | Member
+University of California, Irvine
<h1>Previous Groups</h1>
+<hr class = "h-line">
+<ul>
+ <li><i>Friends of the San Dimas Dog Park</i> | Ambassador <br/>
+ San Dimas, California </li><br/>
+ <li><i>Prisoner Education Project</i> | Volunteer <br/>
+ Pomona, California</li><br/>
+ <li><i>Tails of the City</i> | Volunteer Photographer <br/>
+ Los Angeles, California</li><br/>
+ <li><i>Philosophy Club</i> | President, Graphic Designer, and Banquet Chair <br/>
+ California State Polytechnic University, Pomona</li><br/>
+ <li><i><a href = "https://www.voteamerica.com/">Long Distance Voter</a></i> | Intern <br/>
+ Social Media Content Creator</li><br/>
+ <li><i><a href = "https://www.freepress.net/">Free Press</a></i> | Intern <br/>
+ Social Media Content Creator</li>
+</ul> <!-- </div>
+
</div> + –>
+:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/e1/0320cc69ef23d825da6b357bcb29829b0569da7716ee336cb6ae2e24f8c902 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/e1/0320cc69ef23d825da6b357bcb29829b0569da7716ee336cb6ae2e24f8c902 deleted file mode 100644 index 0e56673..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/e1/0320cc69ef23d825da6b357bcb29829b0569da7716ee336cb6ae2e24f8c902 +++ /dev/null @@ -1,624 +0,0 @@ -I"séWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered them in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether the land is held in fee or as an easement (permission to use someone else’s land); my interest is solely in whether the park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left-hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right-hand side. I am fairly certain I’ve actually been to that exact location. Looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
-                              TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
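- -Since we’re working with a geopackage that holds several layers (instead of a single shapefile), I load the data using st_read()
instead of read_sf(). Both can read a geopackage, since read_sf() is a thin wrapper around st_read(), so use whichever you prefer.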
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
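As a small illustration of the layer argument, the sketch below writes a toy layer to a temporary geopackage and reads it back two ways. The layer name toy_fee and its two rows are invented for the example. The second call uses st_read()'s query argument to filter rows while reading; that is an option I am suggesting, not something the code in this post does, and I have not timed it against the full PAD-US file.

```r
library(sf)

# A toy layer with one state-owned and one federally owned feature.
toy <- st_sf(
  Unit_Nm  = c("Toy State Park", "Toy Federal Site"),
  Own_Type = c("STAT", "FED"),
  geometry = st_sfc(st_point(c(-114.5, 36.4)),
                    st_point(c(-115.8, 33.3)),
                    crs = 4326)
)

gpkg <- tempfile(fileext = ".gpkg")
st_write(toy, gpkg, layer = "toy_fee", quiet = TRUE)

# Read the whole layer by name, as the post does with PADUS3_0Fee.
whole_layer <- st_read(gpkg, layer = "toy_fee", quiet = TRUE)

# Alternative: let the driver filter rows before they reach R.
state_only <- st_read(
  gpkg,
  query = "SELECT * FROM toy_fee WHERE Own_Type = 'STAT'",
  quiet = TRUE
)

nrow(whole_layer)
nrow(state_only)
```

If loading the full Fee layer strains your machine, filtering in the query may reduce how much data R has to hold at once.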
The tidyverse’s filter()
function lets you combine conditions within one filter call by using “or”, written |
, or “and”, written &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
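One possible reason the combined call misbehaves is operator precedence: in R, & binds more tightly than |, so a chain of | conditions has to be wrapped in parentheses before it is joined to the other conditions with &. The sketch below uses a made-up five-row table, and base R's %in% stands in for %!in% so the example only needs dplyr. I have not tested this against the real PAD-US layer.

```r
library(dplyr)

territories <- c("AS", "GU", "MP", "PR", "VI")
keep_types  <- c("SP", "SREC")

# Toy rows using the same column names as the PAD-US Fee layer.
toy <- tibble(
  State_Nm = c("NV", "PR", "CA", "OR", "CA"),
  Own_Type = c("STAT", "STAT", "FED", "STAT", "STAT"),
  Des_Tp   = c("SP", "SP", "SP", "SREC", "MIL")
)

# One filter() call: the | chain is wrapped in parentheses.
combined <- toy %>%
  filter(!(State_Nm %in% territories) &
           Own_Type == "STAT" &
           (Des_Tp == "SP" | Des_Tp == "SREC"))

# Same result with comma-separated conditions and %in%.
same_thing <- toy %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% keep_types)

combined
```

Both versions keep only the Nevada and Oregon rows. The %in% form also scales better when the list of designations grows.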
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’d try to find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
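One thing to keep in mind when filtering with != is that filter() also drops rows where the comparison is NA. The toy table below (invented values, same column name) shows both the count() call I would use to list the access types and the effect of a missing value. I have not checked whether the real PAD-US column contains any NAs.

```r
library(dplyr)

# Toy access values, including one missing value.
toy_access <- tibble(
  Unit_Nm    = c("A", "B", "C", "D", "E"),
  d_Pub_Acce = c("Open Access", "Restricted Access",
                 "Closed", "Unknown", NA)
)

# List every access type and how often it occurs.
toy_access %>% count(d_Pub_Acce)

# Keep only land that can be visited; the NA row is dropped too.
visitable <- toy_access %>%
  filter(d_Pub_Acce != "Closed" &
           d_Pub_Acce != "Unknown")

visitable$Unit_Nm
```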
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of dplyr, which loads with the tidyverse, and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for Recreation Management Area
s; for the rest I just populated the new column with the old values.
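To make the case_when() pattern concrete, here is a minimal sketch on a made-up four-row table. It shows that any value without a matching condition becomes NA, and how a final TRUE ~ branch can carry the old value over instead. The designation "Some Other Designation" is hypothetical.

```r
library(dplyr)

toy_types <- tibble(
  d_Des_Tp = c("State Park", "Access Area",
               "State Wilderness", "Some Other Designation")
)

recoded <- toy_types %>%
  mutate(
    # No fallback: unmatched values become NA.
    type = case_when(
      d_Des_Tp == "Access Area"      ~ "State Trail",
      d_Des_Tp == "State Park"       ~ "State Park or Parkway",
      d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area"
    ),
    # With a fallback: unmatched values keep the original text.
    type_with_fallback = case_when(
      d_Des_Tp == "Access Area"      ~ "State Trail",
      d_Des_Tp == "State Park"       ~ "State Park or Parkway",
      d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
      TRUE                           ~ d_Des_Tp
    )
  )

recoded
```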
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
- -The logic is the same as the National Park data. mutate()
created a new column called visited
and populated it by using case_when()
.
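One possible way to keep the visited list manageable, offered as a sketch and not as what this project ends up using, is to store the park names in a character vector and test membership with %in%. The vector visited_parks and the third park name are hypothetical.

```r
library(dplyr)

# Park names I have visited, kept in one place.
visited_parks <- c("Valley of Fire State Park",
                   "Anza-Borrego Desert State Park")

toy_parks <- tibble(
  Unit_Nm = c("Valley of Fire State Park",
              "Anza-Borrego Desert State Park",
              "Hypothetical State Park")
)

# One membership test replaces one case_when() line per park.
toy_parks <- toy_parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks,
                           "visited", "not visited"))

toy_parks
```

Adding a park then means adding one string to visited_parks, and the vector could later be read from a text file.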
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data for Alaska and Hawaii so that it appears below the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
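The reprojection step can be checked on a single toy point. In this sketch I use EPSG:5070 as a stand-in for the Albers projection and EPSG:4326 for WGS84. Both codes are my assumptions for illustration; the post's code passes a proj string instead.

```r
library(sf)

# A toy point in an Albers equal-area projection (coordinates in metres).
pt_albers <- st_sfc(st_point(c(-1500000, 1800000)), crs = 5070)

# Reproject to longitude/latitude on WGS84, which Leaflet expects.
pt_wgs84 <- st_transform(pt_albers, 4326)

st_is_longlat(pt_albers)  # projected coordinates
st_is_longlat(pt_wgs84)   # longitude/latitude
st_coordinates(pt_wgs84)
```

st_is_longlat() is a quick way to confirm that a layer is ready for Leaflet before adding it to the map.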
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough “as long as condition X is true, do something,” but they are my nemesis. Every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “hello world!”**
- -print("Hello World!")
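The splitting idea can be sketched with a for loop over the state abbreviations. This is only an outline on a made-up table: it writes CSV files to a temporary folder so it runs without the PAD-US data, and the park names and output folder are hypothetical. With the real sf object, st_write() would take the place of write.csv().

```r
library(dplyr)

toy_parks <- tibble(
  State_Nm = c("NV", "CA", "CA", "OR"),
  Unit_Nm  = c("Toy Park 1", "Toy Park 2", "Toy Park 3", "Toy Park 4")
)

# Temporary output folder for the per-state files.
out_dir <- file.path(tempdir(), "states")
dir.create(out_dir, showWarnings = FALSE)

# One pass per state: keep that state's rows and save them.
for (state in unique(toy_parks$State_Nm)) {
  state_rows <- filter(toy_parks, State_Nm == state)
  write.csv(state_rows,
            file.path(out_dir, paste0(state, ".csv")),
            row.names = FALSE)
}

list.files(out_dir)
```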
It made a difference to that one. + ++
<p>I have a simple motto in life: Do what you can, where you are, with what you have. As a result, I believe strongly in doing whatever is in my means to make the world a better place. Below are some groups that I have either founded or joined in order to help those around me.</p>
+ <h1>Current Groups</h1>
+ <hr class = "h-line">
+ <ul>
+ <li><i>Political Science Methodology Group</i> | Co-organizer with Melina Much <br/>
+ University of California, Irvine </li><br/>
+ <li><i>Political Science Womxn's Caucus</i> | Student leader <br/>
+ University of California, Irvine </li><br/>
+ <li><i>Political Science Workshop Coordinator</i> <br/>
+ University of California, Irvine </li><br/>
+ <li><i>Legal Politics Writing Workshop</i><br/>
+ University of California, Irvine </li><br/>
+ <li><i>Center for Democracy: Writing Workshop</i> | Member <br/>
+ University of California, Irvine</li><br/>
+ <li><i>UCI Humanities: Writing Workshop</i> | Member <br/>
+ University of California, Irvine</li><br/>
+ </ul>
+ <h1>Previous Groups</h1>
+ <hr class = "h-line">
+ <ul>
+ <li><i>Friends of the San Dimas Dog Park</i> | Ambassador <br/>
+ San Dimas, California </li><br/>
+ <li><i>Prisoner Education Project</i> | Volunteer <br/>
+ Pomona, California</li><br/>
+ <li><i>Tails of the City</i> | Volunteer Photographer <br/>
+ Los Angeles, California</li><br/>
+ <li><i>Philosophy Club</i> | President, Graphic Designer, and Banquet Chair <br/>
+ California State Polytechnic University, Pomona</li><br/>
+ <li><i><a href = "https://www.voteamerica.com/">Long Distance Voter</a></i> | Intern <br/>
+ Social Media Content Creator</li><br/>
+ <li><i><a href = "https://www.freepress.net/">Free Press</a></i> | Intern <br/>
+ Social Media Content Creator</li>
+ </ul>
+<!-- </div>
+
</div> + –>
+:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/e6/41e43138169733738be4bdd182fe68fa4abfd8d4409cda2af526a94156c186 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/e6/41e43138169733738be4bdd182fe68fa4abfd8d4409cda2af526a94156c186 deleted file mode 100644 index 4007fc4..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/e6/41e43138169733738be4bdd182fe68fa4abfd8d4409cda2af526a94156c186 +++ /dev/null @@ -1,604 +0,0 @@ -I"CâWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
It made a difference to that one. + ++ +
I have a simple motto in life: Do what you can, where you are, with what you have. As a result, I believe strongly in doing whatever is in my means to make the world a better place. Below are some groups that I have either founded or joined in order to help those around me.</p>
+ +UCI Humanities: Writing Workshop | Member
+
University of California, Irvine
<h1>Previous Groups</h1>
+<hr class = "h-line">
+<ul>
+ <li><i>Friends of the San Dimas Dog Park</i> | Ambassador <br/>
+ San Dimas, California </li><br/>
+ <li><i>Prisoner Education Project</i> | Volunteer <br/>
+ Pomona, California</li><br/>
+ <li><i>Tails of the City</i> | Volunteer Photographer <br/>
+ Los Angeles, California</li><br/>
+ <li><i>Philosophy Club</i> | President, Graphic Designer, and Banquet Chair <br/>
+ California State Polytechnic University, Pomona</li><br/>
+ <li><i><a href = "https://www.voteamerica.com/">Long Distance Voter</a></i> | Intern <br/>
+ Social Media Content Creator</li><br/>
+ <li><i><a href = "https://www.freepress.net/">Free Press</a></i> | Intern <br/>
+ Social Media Content Creator</li>
+</ul> <!-- </div>
+
</div> + –>
+:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/ed/f3ca0c1065299402acb160458fe40acd5ac1b5b65cd02bbec44cd448e7db7c b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/ed/f3ca0c1065299402acb160458fe40acd5ac1b5b65cd02bbec44cd448e7db7c deleted file mode 100644 index 23b91b8..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/ed/f3ca0c1065299402acb160458fe40acd5ac1b5b65cd02bbec44cd448e7db7c +++ /dev/null @@ -1,617 +0,0 @@ -I"üćWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # reads and transforms spatial data
- library("tigris") # shift_geometry() for Alaska and Hawaii
- library("leaflet") # creates the map
- library("operator.tools") # not-in function (%!in%)
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
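A minimal sketch of inspecting the result of st_layers(), assuming a tiny stand-in geopackage written to a temporary file rather than the PAD-US download. The layer name demo_layer and the two points are hypothetical and exist only so the example runs on its own.

```r
library(sf)

# Build a tiny stand-in geopackage (hypothetical data, not PAD-US)
# so the example runs without the large download.
pts <- st_sf(
  name = c("a", "b"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326)
)
gpkg_path <- tempfile(fileext = ".gpkg")
st_write(pts, gpkg_path, layer = "demo_layer", quiet = TRUE)

# st_layers() reads only the layer metadata, not the features themselves
layers <- st_layers(gpkg_path)

layers$name      # layer names, usable as the `layer` argument of st_read()
layers$geomtype  # geometry type of each layer
layers$features  # number of features in each layer
```

The values in layers$name are what goes into the layer argument when a single layer is loaded later.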
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
-                                TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage that holds several layers (instead of a single shapefile), I load the data using st_read()
instead of read_sf()
, although either function can read a geopackage. st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the layer PADUS3_0Fee layer because I know it contains the shape data from the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
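One possible reason a single combined call misbehaves is operator precedence: in R, & binds more tightly than |, so an unparenthesised mix of the two groups differently than it reads. The sketch below sidesteps that by using %in% against a vector of designation codes. The toy parks table and the keep_types name are hypothetical; only the column names and codes come from the code above.

```r
library(dplyr)

# Toy stand-in for the Fee layer (hypothetical rows)
parks <- tibble(
  State_Nm = c("NV", "PR", "CA", "OR"),
  Own_Type = c("STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SP", "SP", "MIT")
)

territories <- c("AS", "GU", "MP", "PR", "VI")
keep_types  <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Comma-separated conditions inside filter() are combined with AND,
# and %in% replaces the long chain of Des_Tp == ... | Des_Tp == ...
state_only <- parks %>%
  filter(
    !(State_Nm %in% territories),
    Own_Type == "STAT",
    Des_Tp %in% keep_types
  )

state_only
```

Only the Nevada row survives here: the Puerto Rico row is a territory, the California row is not state owned, and the Oregon row has a designation outside the list.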
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for "Recreation Management Area
s, the rest I just populated the new column with the old values.
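The general form can be sketched on a three-row toy table, assuming that any value without its own rule should be copied over unchanged. The designations table is hypothetical; the column name and labels are taken from the code above.

```r
library(dplyr)

# Toy designation column (hypothetical rows)
designations <- tibble(
  d_Des_Tp = c("State Park", "Recreation Management Area", "State Wilderness")
)

relabelled <- designations %>%
  mutate(type = case_when(
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    TRUE ~ d_Des_Tp  # fallback: keep the original value
  ))

relabelled
```

Without the TRUE fallback, every row that matches no condition gets NA in the new column.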
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the meantime I’ll figure out a better way to keep track of the parks I’ve been to.
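One possible way to make the visited flag less unwieldy, offered as a suggestion rather than the approach used in this series, is to keep the park names in a single vector and test membership with %in%. The visited_parks vector and the toy parks table are hypothetical.

```r
library(dplyr)

# Names of visited parks kept in one place (hypothetical list)
visited_parks <- c(
  "Valley of Fire State Park",
  "Anza-Borrego Desert State Park"
)

# Toy stand-in for the state park data
parks <- tibble(
  Unit_Nm = c(
    "Valley of Fire State Park",
    "Some Other State Park",
    "Anza-Borrego Desert State Park"
  )
)

# One membership test replaces one case_when() line per park
parks_flagged <- parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))

parks_flagged
```

Adding a park then means adding one string to the vector, and the vector could just as well be read from a small text or CSV file.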
- -The logic is the same as the National Park data. mutate()
created a new column type
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US and of comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
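The reprojection step can be sketched on a single point. EPSG:5070 is used here as a stand-in Albers projection, which is an assumption; the projection produced by the shift step may differ. EPSG:4326 is the code for longitude/latitude on the WGS84 datum.

```r
library(sf)

# One point in an Albers equal-area projection (EPSG:5070 is an assumed stand-in)
albers_pt <- st_sfc(st_point(c(0, 0)), crs = 5070)

# Reproject to WGS84 longitude/latitude, which Leaflet expects
wgs84_pt <- st_transform(albers_pt, 4326)

st_is_longlat(albers_pt)  # projected coordinates in metres
st_is_longlat(wgs84_pt)   # geographic coordinates in degrees
```

st_is_longlat() is a quick check to run before handing data to Leaflet.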
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save it, and in part VI of this never-ending series* I’ll create individual state maps and link them to the map of National Parks.
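Separating the parks by state can be sketched with base R’s split(), shown here on a plain toy table rather than the real sf object. The rows are hypothetical, and this is only one way to do it, not necessarily the code used later in the series.

```r
library(dplyr)

# Toy stand-in for the processed state park data
parks <- tibble(
  State_Nm = c("CA", "NV", "CA"),
  Unit_Nm  = c(
    "Salton Sea",
    "Valley of Fire State Park",
    "Anza-Borrego Desert State Park"
  )
)

# split() returns a named list with one table per state
by_state <- split(parks, parks$State_Nm)

names(by_state)    # one entry per state abbreviation
nrow(by_state$CA)  # number of parks in the California table
```

Each element of the list could then be saved to its own file by looping over names(by_state).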
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
-:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/f1/53988ae2a401cf27c8bcf9a5bb4b2fa67a6dc4e1724c707f45b61b01f8323d b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/f1/53988ae2a401cf27c8bcf9a5bb4b2fa67a6dc4e1724c707f45b61b01f8323d deleted file mode 100644 index e76d786..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/f1/53988ae2a401cf27c8bcf9a5bb4b2fa67a6dc4e1724c707f45b61b01f8323d +++ /dev/null @@ -1,285 +0,0 @@ -I"ßIWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/f3/b968f2da9f2e228b0f2f61ad11e95ff4a96a2429de956a54c231a0c12af28e b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/f3/b968f2da9f2e228b0f2f61ad11e95ff4a96a2429de956a54c231a0c12af28e deleted file mode 100644 index 2a3f917..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/f3/b968f2da9f2e228b0f2f61ad11e95ff4a96a2429de956a54c231a0c12af28e +++ /dev/null @@ -1,454 +0,0 @@ -I"™ŤWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages do because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load only the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
rather than read_sf(), although read_sf() (a wrapper around st_read() with quieter defaults) can read geopackages too.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines
- - - -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/f6/39707a350f09783dc957f58819d951ed69368412a34905946b94a4d58e86e6 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/f6/39707a350f09783dc957f58819d951ed69368412a34905946b94a4d58e86e6 deleted file mode 100644 index 752aed3..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/f6/39707a350f09783dc957f58819d951ed69368412a34905946b94a4d58e86e6 +++ /dev/null @@ -1,305 +0,0 @@ -I"l\This is a continuation of my previous post where I walked through how to download and modify shape data. I also showed how to shift Alaska and Hawaii so they are closer to the continental usa. -
- -In this post, I’ll go over how to use Leaflet to map the shapefile we made in the previous post. If you’ve come here from part one of the series, you probably have the libraries and data loaded already. However, if you don’t, be sure to load the libraries and shapefiles before moving to number two.
- -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-
## load data
- states <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
to reflect wherever you saved the shifted shapefile.
If your data processing and base map creation are in the same file, you can skip this line, and when you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -At its most basic, all Leaflet needs to create a map is a base map and data layers. The code below may look intimidating, but it’s mostly style options.
- -This is the map we’re going to create. It’s a simple grey map and each state darkens in color as you hover over it. I’ll show the same map after each style option is added so you can see what effect it has.
- - - -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-
## create usa base map using leaflet()
- map <- leaflet() %>%
- addPolygons(data = states,
- smoothFactor = 0.2,
- fillColor = "#808080",
- fillOpacity = 0.5,
- stroke = TRUE,
- weight = 0.5,
- opacity = 0.5,
- color = "#808080",
- highlight = highlightOptions(
- weight = 0.5,
- color = "#000000",
- fillOpacity = 0.7,
- bringToFront = FALSE),
- group = "Base Map")
-
leaflet()
initializes the map widget. I save it to a variable called map (map <-
) so I can run other code in the file without recreating the map each time. When you want to see the map, you can type map
(or whatever you named your map) in the console and hit enter. R will display the map in the viewer.
addPolygons()
adds a layer to the map widget. Leaflet has different layer options, including addTiles
and addMarkers
which do different things. You can read about them on the leaflet website. Since we’re using a previously created shapefile, we’ll add the shapefile to the map using addPolygons()
.
The first argument you need to specify after calling addPolygons is data = [data-source]
. [data-source]
is whatever variable your data is stored in. For me, it’s called states
. This is either the processed data from part I of this series or the saved shapefile loaded above under the section called load data.
When you run only the first two lines, Leaflet will use its default styling. The base color will be a light blue and the outlines of the states will be dark blue and fairly thick.
- - - -You can leave the base map like this if you want, but all additional data will be added as a layer on top of this map, which can become distracting very quickly. I prefer to make my base maps as basic and unobtrusive as possible so the data I add on top of the base map is more prominent.
- -smoothFactor
controls how much the polygon shape should be smoothed at each zoom level. The lower the number the more accurate your shapes will be. A larger number, on the other hand, will lead to better performance, but can distort the shapes of known areas.
I keep the smoothFactor
low because I want the United States to appear as a coherent land mass. The image below shows three different maps, each with a different smoothFactor to illustrate what this argument does. On the left, the map’s smoothFactor=0.2
, the center map’s smoothFactor=10
, and the right’s smoothFactor=100
.
As you can see, the higher the smoothFactor
the less coherent the United States becomes.
addPolygons()
.
-fillColor
refers to what color is on the inside of the polygons. Since I want a minimal base map, I usually set this value to be some shade of grey. If you want a different color, you only need to replace #808080
with the corresponding hex code for the color you want. Here is a useful hex color picker. If you have a hex value and you want the same color in a different shade, this is a useful site.
fillOpacity
determines how transparent the color inside the shape should be. I set mine to be 0.5
because I like the way it looks. The number can be between 0 and 1 with 1 being fully opaque and 0 being fully transparent.
The next four lines define the appearance of the shapes’ outline.
- -The stroke
property can be set to either TRUE
or FALSE
. When true, Leaflet adds an outline around each polygon. When false, the polygons have no outline. In the image below, the map on the left has the default outlines and on the right stroke = FALSE
.
weight = 0.5
sets the thickness of the outlines to be 0.5 pixels. This can be any value you want with higher numbers corresponding to thicker lines. Lower numbers correspond to thinner lines.
The opacity
property operates in the same way as fill opacity above, but on the outlines. The number can be between 0 and 1. Lower numbers correspond to the lines being more transparent and 1 means fully opaque.
color = "#808080"
sets the color of the outline. I typically set it to be the same color as the fill color.
If you want a static base map then lines 2-10 are all you need, as shown in the image below. I like to add some functionality to my base map so that the individual states become darker when they’re hovered over.
- - - -Lines 11-15 define the map’s behavior when the mouse hovers over the shape. Most of the options are the same as the ones used on the base polygon shapes, so I won’t go into them with much detail.
- -highlight = highlightOptions()
contains the mouseover specifications. The word before the equal sign has to be either highlight
or highlightOptions
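. It can look as though highlight is declared twice, but the first is just the name of the addPolygons() argument (highlightOptions, which R lets you abbreviate to highlight).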
highlightOptions()
is the actual function call.
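The reason both spellings work is R’s partial matching of argument names, which can be seen with a toy function. This is only a sketch: demo_layer() is a hypothetical stand-in I made up for illustration, not part of Leaflet.

```r
# Hypothetical stand-in for a function that has an argument named `highlightOptions`.
demo_layer <- function(data, highlightOptions = NULL) {
  highlightOptions
}

# The full argument name and an unambiguous abbreviation reach the same parameter.
full    <- demo_layer(data = NULL, highlightOptions = list(weight = 0.5))
partial <- demo_layer(data = NULL, highlight = list(weight = 0.5))

identical(full, partial)
```

Spelling out the full argument name is probably the safer habit, since an abbreviation stops working if a function ever gains a second argument starting with the same letters.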
weight
, color
, and fillOpacity
all operate in the same way as before, but whatever values you specify here will only show up when the mouse hovers over.
bringToFront
takes one of two values: TRUE
or FALSE
. It only really matters when you have multiple layers (like we will in later parts of this series). When bringToFront = TRUE
hovering over the state will bring it to the front. When bringToFront = FALSE
it will stay in the back.
Since the base map has only one layer, this property doesn’t affect anything.
- -group = "Base Map")
lets you group multiple layers together. This argument will come in handy as we add more information to the map. The base map is the default layer and is always visible - though, when you use map tiles you can define multiple base layers. All other layers will be on top of the base layer. When using different groups, you can define functionality that allows users to turn off certain layers.
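As a rough sketch of what groups make possible, the snippet below adds a layer control so a group can be toggled on and off. The marker coordinates and the "State Parks" group name are placeholders I chose for the demo, and it uses map tiles instead of the shapefile so it runs without any downloaded data.

```r
library(leaflet)

# Placeholder marker (roughly where Valley of Fire is); illustrative only.
demo_map <- leaflet() %>%
  addTiles(group = "Base Map") %>%
  addMarkers(lng = -114.51, lat = 36.44, group = "State Parks") %>%
  # overlayGroups get a checkbox in the control so users can hide or show them
  addLayersControl(
    overlayGroups = c("State Parks"),
    options = layersControlOptions(collapsed = FALSE)
  )
```

Typing demo_map in the console should show a map with a small checkbox panel for the "State Parks" group.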
You’ve created your first base map! It’s a boring flat, grey map, but it’s the base we’ll use when adding in the national and state park data. In part III of this series we’ll process and add in the National Parks.
- - -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/f6/c8ec1b66f423ee8286319faf5ef6b916a8b62f8640e8c82b1975f8009b7062 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/f6/c8ec1b66f423ee8286319faf5ef6b916a8b62f8640e8c82b1975f8009b7062 deleted file mode 100644 index 715b456..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/f6/c8ec1b66f423ee8286319faf5ef6b916a8b62f8640e8c82b1975f8009b7062 +++ /dev/null @@ -1,622 +0,0 @@ -I"céWelcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to inspect the data without loading all of it into R. The first is to load only the geopackage’s layer names. This doesn’t load the complete data set; it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
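If you want to try st_layers() without downloading PAD-US first, here is a small sketch that writes a throwaway geopackage to a temporary file and lists its layers. The layer name demo_layer and the two points are made up for the example.

```r
library(sf)

# Build a tiny throwaway geopackage so the example runs without the PAD-US download.
demo_points <- st_sf(
  name = c("a", "b"),
  geometry = st_sfc(st_point(c(-114.5, 36.4)), st_point(c(-124.2, 42.0)), crs = 4326)
)
demo_path <- tempfile(fileext = ".gpkg")
st_write(demo_points, demo_path, layer = "demo_layer", quiet = TRUE)

# st_layers() reads the layer metadata only, not the features themselves.
demo_layers <- st_layers(demo_path)
demo_layers$name      # layer names
demo_layers$geomtype  # geometry type of each layer
```

The same $name element is what I would look at for the real geopackage when deciding which layer to pass to st_read().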
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-16
-17
-18
-19
-20
-21
-22
-23
-24
-25
-26
-27
-28
-29
-30
-31
-32
-33
-34
-35
-36
-37
-38
-39
-40
-41
-42
-43
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW") %>%
- filter(d_Pub_Acce != "Closed" &
- d_Pub_Acce != "Unknown") %>%
- filter(Loc_Ds != "ACC" &
- Loc_Ds != "Hunter Access",
- Loc_Ds != "Public Boat Ramp") %>%
- select(d_Own_Type, d_Des_Tp, Loc_Ds, Unit_Nm, State_Nm, d_State_Nm, GIS_Acres) %>%
- mutate(type = case_when(d_Des_Tp == "Access Area" ~ "State Trail",
- d_Des_Tp == "Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "State Historic or Cultural Area" ~ "State Historical Park, Site, Monument, or Memorial",
- d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Resource Management Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Wilderness" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Recreation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Conservation Area" ~ "State Preserve, Reserve, or Recreation Area",
- d_Des_Tp == "State Park" ~ "State Park or Parkway")) %>%
- mutate(visited = case_when(Unit_Nm == "Valley of Fire State Park" ~ "visited",
- Unit_Nm == "Crissey Field State Recreation Site" ~ "visited",
- Unit_Nm == "Salton Sea" ~ "visited",
- Unit_Nm == "Anza-Borrego Desert State Park" ~ "visited",
- Unit_Nm == "Jedediah Smith Redwoods State Park" ~ "visited",
- Unit_Nm == "Del Norte Coast Redwoods State Park" ~ "visited",
- TRUE ~ "not visited")) %>%
- shift_geometry(preserve_area = FALSE,
- position = "below") %>%
- sf::st_transform("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
-
- ## save shifted park data
- # st_write(state_parks, "./shapefiles/shifted/states/state_parks.shp")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
rather than read_sf(), although read_sf() (a wrapper around st_read() with quieter defaults) can read geopackages too.
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
R and the Tidyverse’s filter()
function allows you to combine conditions within one filter call by using or |
or and &
.
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Theoretically, lines 7-15 can be included with the first filter()
call in line 5, but I couldn’t get it to work.
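For what it’s worth, here is a sketch of how the two filters might be merged. It runs on a made-up five-row table rather than the real PAD-US layer, and uses base R’s %in% in place of the chain of == comparisons. My guess (an assumption, I haven’t confirmed it against the original attempt) is that the merge failed because & binds more tightly than |, so the or-conditions need parentheses or %in%.

```r
library(dplyr)

# Toy stand-in for the PAD-US attribute table (no geometry needed here).
demo <- tibble(
  State_Nm = c("NV", "OR", "PR", "CA", "CA"),
  Own_Type = c("STAT", "STAT", "STAT", "FED", "STAT"),
  Des_Tp   = c("SP", "SREC", "SP", "SP", "MIL")
)

territories  <- c("AS", "GU", "MP", "PR", "VI")
designations <- c("ACC", "HCA", "REC", "SCA", "SHCA", "SP", "SREC", "SRMA", "SW")

# Two chained filters, mirroring the structure used in the post.
chained <- demo %>%
  filter(!(State_Nm %in% territories) & Own_Type == "STAT") %>%
  filter(Des_Tp == "SP" | Des_Tp == "SREC")

# One call: comma-separated conditions are combined with AND.
combined <- demo %>%
  filter(!(State_Nm %in% territories),
         Own_Type == "STAT",
         Des_Tp %in% designations)
```

On this toy table both versions keep the same two rows (the Nevada and Oregon parks), which is the behaviour I’d want to confirm on the real data before swapping the code.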
Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-This will leave me with 50,102 rows.
- -nrow(state_parks)
-Yet another filter()
call. These two lines tell R to exclude any row whose d_Pub_Acce is Closed or Unknown.
The data has four types of access: Closed, Unknown, Open Access, and Restricted Access. I’m only interested in land that I can visit, so I want to keep only the parks with Open or Restricted Access. In the filter()
call, I chose to use !=
solely because months or years from now when I look at this code it will be easier for me to figure out what I was doing. I know myself and if I saw d_Pub_Acce == "Open Access"
my first thought would be: “What are the other types?” and then I’ll try and find out and waste a bunch of time.
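A quick way to answer the “what are the other types?” question later is to count the values in the column. The sketch below uses a made-up table with the four access values named above plus a missing value. It also shows a side effect worth knowing about: filter() drops rows where the comparison is NA.

```r
library(dplyr)

# Made-up rows covering the four access values plus a missing one.
demo_access <- tibble(
  Unit_Nm    = c("Park A", "Park B", "Park C", "Park D", "Park E"),
  d_Pub_Acce = c("Open Access", "Closed", "Restricted Access", "Unknown", NA)
)

# List every access value and how often it occurs.
demo_access %>% count(d_Pub_Acce)

# != removes the named values; rows where d_Pub_Acce is NA are dropped as well.
kept <- demo_access %>%
  filter(d_Pub_Acce != "Closed" & d_Pub_Acce != "Unknown")
```

Here kept ends up with only Park A and Park C.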
This last filter brings the total number of state parks down to 49,719. I don’t think I can reduce that number more without removing places that should be kept.
- -*lines 18-20
- - - -Lines 18-20 have the same logic as lines 16-17 except here I want to filter out the Hunter Access areas and Boat Ramps.
- -Now that I’ve pared down the data a little bit, I want to discard any columns I don’t need.
- -select()
lets me choose the columns I want to keep by name, rather than by index number.
I decided to keep:
-mutate()
is part of the tidyverse package and it’s extremely versatile. It is mainly used to create new variables or modify existing ones.
I wanted the state park designations to match closely with the types I used in the National Park data.
- -I went over the logic of using mutate()
and case_when()
in Part III of this series, so I won’t cover it again here.
In its general form, the format is case_when(COLUMN_NAME == "original_value" ~ "new_value")
. I only needed to change the values for Recreation Management Area
s; for the rest, I just populated the new column with the old values.
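As a small illustration of that general form, the sketch below recodes one value and lets everything else fall through unchanged. The three-row table is made up, and the TRUE ~ d_Des_Tp fallback is one way to “keep the old value”, not necessarily how the full pipeline above does it.

```r
library(dplyr)

demo_types <- tibble(
  d_Des_Tp = c("State Park", "Recreation Management Area", "State Wilderness")
)

recoded <- demo_types %>%
  mutate(type = case_when(
    d_Des_Tp == "Recreation Management Area" ~ "State Preserve, Reserve, or Recreation Area",
    TRUE ~ d_Des_Tp   # anything unmatched keeps its original value
  ))
```

Without a fallback line, any row that matches none of the conditions gets NA in the new column.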
Here is where I ran into some issues. In part III of the series when I processed the National Park data I included a mutate()
and case_when()
call to mark whether I’ve visited the park or not. It’s not a very elegant solution since I have to modify each park individually, but it was passable since I’ve only been to a handful of National Parks. For the state parks, though, it is unwieldy.
I had originally wanted to drop the geometry and download the parks as a CSV, but even that was overwhelming.
- -In the end, I decided to focus on the parks that I know I’ve visited and have taken photos at. I’ve visited many, many state parks, but until I have the photos to add to the markers (covered in part five), I’m omitting them from this code. Hopefully in the mean time I’ll figure out a better way to keep track of the parks I’ve been to.
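One option I may try is keeping the visited parks in a plain character vector and flagging them with %in%, so adding a park means adding one string instead of another case_when() line. This is only a sketch on a made-up table; visited_parks is a name I invented for it.

```r
library(dplyr)

# Hypothetical lookup vector; extend it as more parks get photos.
visited_parks <- c("Valley of Fire State Park", "Salton Sea")

demo_parks <- tibble(
  Unit_Nm = c("Valley of Fire State Park", "Salton Sea", "Some Other State Park")
)

flagged <- demo_parks %>%
  mutate(visited = if_else(Unit_Nm %in% visited_parks, "visited", "not visited"))
```

The vector could just as easily be read from a small text or CSV file, which would keep the list of parks out of the processing code entirely.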
- -The logic is the same as the National Park data. mutate()
created a new column visited
and populated it by using case_when()
.
I’ve covered these lines extensively in part II and part III of this series.
- -Lines 38-39 shift the state park data from Alaska and Hawaii so it appears under the continental US at a comparable size.
- -Line 40 is required to change the coordinate system from Albers to WGS84 - the latter of which is required by Leaflet.
- -Line 43 saves the shifted shapefile to the hard drive. Delete the #
from the start of the line to save the file.
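As a rough sketch of the reprojection step, the example below moves one made-up point from an Albers projection to WGS84. I am assuming EPSG:5070 as the Albers definition purely for illustration (the real script may use a different one), and the output file name in the commented line is a placeholder.

```r
library(sf)

# One made-up point in an Albers equal-area projection (EPSG:5070 is an assumption),
# standing in for the shifted state park polygons.
albers_pt <- st_sf(
  name     = "example",
  geometry = st_sfc(st_point(c(0, 0)), crs = 5070)
)

# Reproject to WGS84 (EPSG:4326), the coordinate system Leaflet expects.
wgs_pt <- st_transform(albers_pt, 4326)

st_crs(wgs_pt)$epsg

# Saving is left commented out, as in the post; the path is a placeholder.
# st_write(wgs_pt, "state_parks.shp")
```

Checking st_crs() after the transform is a cheap way to confirm the data is in the system Leaflet needs before trying to draw it.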
I tried to map the base map, National Parks, and the state parks. It did not go well. R froze, my computer screamed, and chaos ensued. As a result, I had to rethink my map. I decided to separate the state parks by state, save them, and in part VI of this never-ending series* I’ll create individual state maps. When you click on a state it’ll take you to a map that includes the state parks.
- -Unfortunately, this also means I need to separate the National Parks by state so they also appear on the individual maps. The logic will be the same so I am not going to update part III to reflect that change. If you want to see that code it’s available on the project repo.
- -I don’t want to manually separate and save each state, so I’m going to use a loop! I hate loops. The logic is simple enough: “as long as condition X is true, do something.” So simple, yet every time I’ve tried to learn a programming language I have struggled with loops. That’s pretty sad considering it’s like day 2 of any programming class. Day 1 is learning how to write “Hello World!”**
- -* I have annoyed myself with how long this series is. Hopefully it is helpful. Drop me a line if it is.
-** print("Hello World!")
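For what it’s worth, the split-and-save idea can be sketched with a short for loop. Everything here is a stand-in: the three points are made up, the output goes to a temporary folder, and I write one geopackage per state rather than a shapefile just to keep each state in a single file. State_Nm and Unit_Nm are the PAD-US column names used earlier in this post.

```r
library(sf)

# Made-up stand-in for the processed state park data.
state_parks <- st_sf(
  State_Nm = c("NV", "OR", "NV"),
  Unit_Nm  = c("Park A", "Park B", "Park C"),
  geometry = st_sfc(
    st_point(c(-114.5, 36.4)),
    st_point(c(-124.2, 42.0)),
    st_point(c(-115.0, 36.0)),
    crs = 4326
  )
)

# Fresh placeholder folder so nothing is overwritten.
out_dir <- tempfile("state_parks_")
dir.create(out_dir)

# One pass per state: keep that state's rows, then save them to their own file.
for (state in unique(state_parks$State_Nm)) {
  one_state <- state_parks[state_parks$State_Nm == state, ]
  st_write(one_state, file.path(out_dir, paste0(state, ".gpkg")), quiet = TRUE)
}

list.files(out_dir)
```

Looping over unique(state_parks$State_Nm) means the loop runs once per state that is actually in the data, so there is no list of states to keep up to date by hand.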
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading it all into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
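If you want to pull individual pieces out of the result, the object st_layers() returns can be indexed like a list. The sketch below writes a tiny, made-up geopackage to a temporary file so it runs without the PAD-US download; the layer name demo_layer and the single point are assumptions for illustration.

```r
library(sf)

# Write a tiny, made-up geopackage so the example runs without the PAD-US download.
demo_path <- tempfile(fileext = ".gpkg")
demo <- st_sf(
  Unit_Nm  = "Example Park",
  geometry = st_sfc(st_point(c(-114.5, 36.4)), crs = 4326)
)
st_write(demo, demo_path, layer = "demo_layer", quiet = TRUE)

# List the layers without reading any of the features into memory.
layers <- st_layers(demo_path)

layers$name      # layer names
layers$features  # number of features in each layer
```

With the real PAD-US file, layers$name is the quickest way to copy the exact layer name into the later st_read() call.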
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
- filter(Own_Type == "State")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
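- -Since we’re working with a geopackage (instead of a shapefile), I load the data using st_read()
instead of read_sf(). Either function can read a geopackage; read_sf() is a quieter wrapper around st_read() that returns a tibble.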
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data from the state parks.
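Here is a small, self-contained sketch of the same read-then-filter pattern. It uses a made-up geopackage with a hypothetical layer called demo_fee instead of the real PAD-US file, and it swaps the %!in% operator from operator.tools for base R’s !(x %in% y), which does the same job without the extra package.

```r
library(sf)
library(dplyr)

# Made-up geopackage standing in for the PAD-US download.
demo_path <- tempfile(fileext = ".gpkg")
demo <- st_sf(
  State_Nm = c("NV", "PR", "CA"),
  Own_Type = c("State", "State", "Federal"),
  geometry = st_sfc(
    st_point(c(-114.5, 36.4)),
    st_point(c(-66.5, 18.2)),
    st_point(c(-115.8, 33.3)),
    crs = 4326
  )
)
st_write(demo, demo_path, layer = "demo_fee", quiet = TRUE)

territories <- c("AS", "GU", "MP", "PR", "VI")

state_only <- st_read(demo_path, layer = "demo_fee", quiet = TRUE) %>%
  filter(!(State_Nm %in% territories)) %>%  # drop the territories
  filter(Own_Type == "State")               # keep state-owned land only

state_only$State_Nm
```

In this toy data the Puerto Rico row is dropped by the first filter and the federal California row by the second, which leaves only the Nevada row.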
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading it all into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/fc/a1e49b7030b0eeb6fc84e0634cc1b76a3f1fee7713a3d212618c8994dbc925 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/fc/a1e49b7030b0eeb6fc84e0634cc1b76a3f1fee7713a3d212618c8994dbc925 new file mode 100644 index 0000000..96e2724 --- /dev/null +++ b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/fc/a1e49b7030b0eeb6fc84e0634cc1b76a3f1fee7713a3d212618c8994dbc925 @@ -0,0 +1,53 @@ +I"ł +It made a difference to that one. + ++ +
I have a simple motto in life: Do what you can, where you are, with what you have. As a result, I believe strongly in doing whatever is in my means to make the world a better place. Below are some groups that I have either founded or joined in order to help those around me.</p>
+ +</div> + –>
+:ET \ No newline at end of file diff --git a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/fd/7b44ab61e2ae44ac2a6c69d0e4981eec0860d7b2bc00536015e43f1af4b1f4 b/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/fd/7b44ab61e2ae44ac2a6c69d0e4981eec0860d7b2bc00536015e43f1af4b1f4 deleted file mode 100644 index 1dbdd57..0000000 --- a/.jekyll-cache/Jekyll/Cache/Jekyll--Converters--Markdown/fd/7b44ab61e2ae44ac2a6c69d0e4981eec0860d7b2bc00536015e43f1af4b1f4 +++ /dev/null @@ -1,333 +0,0 @@ -I"\Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
- -1
-2
-3
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, and by state. Click on National Geopackage. Make sure you select the geopackage one rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading it all into R. The first is to load the geopackage layers. This doesn’t load the complete data set. Instead, it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there’s no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories) %>%
- filter(Own_Type == "State")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data from the state parks.
Welcome to part four of my cartography in R series. In this post, we’ll download and process the state park data before adding it to the base map created in part II.
- - - -I had to break the tutorial into different parts because it became unwieldy. I list the component parts below. The annotated version of the code can be found in this project’s repository in the folder called r files
- - -III. cartography in r part three
-IV. cartography in r part four [this post]
-1
-2
-3
-4
-5
-
## load libraries
- library("tidyverse") # data manipulation & map creation
- library("sf") # loads shapefile
- library("leaflet") # creates the map
- library("operator.tools") # not-in function
-
I am not going to explain in detail what each of these packages does because I already covered it in part one.
-
## load data
- usa <- read_sf("~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp")
- nps <- read_sf("~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp")
-
Be sure to change ~/Documents/Github/nps/shapefiles/shifted/usa/usa.shp
and ~/Documents/GitHub/nps/shapefiles/shifted/nps/nps.shp
to reflect where you saved the shifted shapefiles.
If your data processing and base map creation are in the same file, you can skip these lines. When you make the Leaflet call below, you’ll use the name of the variable where the shape data is stored.
- -Unlike the National Park data, the state data is harder to come by. After quite a bit of searching I found the Protected Areas Database of the United States (PAD-US) created by the United States Geological Survey. The data they collect is amazing. It is a “national inventory of U.S. terrestrial and marine protected areas.”
- -When they say national inventory, they mean it. It includes a detailed accounting of nearly every piece of public land “preserved through legal or other effective means” in the United States. For my project, it’s overkill and it will take some investigating to determine what information I want to keep and what to discard.
- -The first step is to navigate to the PAD-US website and download the data. On the PAD-US website, click View Data Download.
- - - -On the next page, you can download the National PAD-US data, the data by Census region, or the data by state. Click on National Geopackage. Make sure you select the geopackage version rather than the geodatabase version. On the next page, you’ll confirm you’re not a robot before the download link appears.
- - - -When it downloads, unzip it somewhere on your hard drive. The zip folder contains 35 files, so I suggest you create a folder before unzipping it so they stay together.
- -Since the geopackage is so large, R has a difficult time opening and displaying the data. There are a couple of ways to view the data that do not rely on loading all of it into R. The first is to load only the geopackage’s layer names. This doesn’t load the complete data set; it just loads the layer names to use in conjunction with the documentation and the PAD-US Viewer.
- -1
-
layers <- st_layers("./shapefiles/original/state parks/padus_national_gpkg/PADUS3_0Geopackage.gpkg")
-
I save the layer names to a variable called layers
so I can load them whenever I want. You don’t have to, the st_layers()
function will return the list regardless of whether it’s saved.
st_layers()
returns the name of the layer, then what kind of layer it is (Multi Polygon), how many features it contains, and what coordinate system it uses.
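If you want to see what st_layers() returns without waiting on the full PAD-US file, here is a minimal sketch that builds a tiny throwaway geopackage first. The layer name toy_layer and the two points are made up for illustration; they are not part of PAD-US.

```r
library(sf)

# Two invented points, only used to give the toy geopackage some content
pts <- st_sf(
  id = 1:2,
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326)
)

# Write them to a temporary geopackage under a hypothetical layer name
tmp <- tempfile(fileext = ".gpkg")
st_write(pts, tmp, layer = "toy_layer", quiet = TRUE)

# st_layers() reads only the layer metadata, not the features themselves
layers <- st_layers(tmp)
layers$name      # layer names
layers$features  # number of features per layer
```

Pointing st_layers() at the PAD-US geopackage works the same way, and because it only reads metadata it should stay fast even when the file is large.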
The PAD-US geopackage contains seven layers which contain different kinds of information. The type of map you want to create will determine which layer you will use. For my map I want the same information for the state parks as I have for the national parks. I want:
-I have to figure out which of the seven layers listed above includes this information. You can load each layer (using the st_read()
code below), but this method usually causes R to crash because there’s too much information for it to handle.
Instead, I go to the PAD-US Viewer and search for a state park. I am going to choose the Valley of Fire State Park in Nevada because 1) I know it exists, 2) I have been there. You can choose any state park you’ve been to, or choose one from a list of parks by state on Wikipedia.
- - - -In the search bar on the left, type in Valley of Fire (1) and select it from under the Official Place Names (2) panel that appears after you start typing. The map will adjust to show the park and you can click Done (3).
- - - -The default view on the PAD-US Viewer includes the first layer’s information, Fee Managers in the viewer and PADUS3_0Fee in the geopackage. According to the documentation, most public land is “owned in fee” though some land is held in “long-term easements, leases, agreements” or by Congressional designation. To own land “in fee” means to own it completely, without limitations or conditions. The Fee layer lists what agency owns the land in its entirety.
- -I don’t actually care whether it’s in fee or easement (allowed to use someone else’s land), my interest is solely in whether the Park’s polygon is visible when this layer is selected in the viewer.
- - - -The purple-ish polygon on the map shows the geographic boundaries of the Valley of Fire State Park. This is what I want to see. Not all of the layers include the shape data for the state parks.
- -For example, if I uncheck the Fee Manager layer and select the Federal Fee Managers (Authoritative Data) layer the polygon for the Valley of Fire disappears. This is because it is not federal land - it’s state land. Be careful when choosing the layers because not all of them contain the correct information.
- - - -The USGS is interested in fundamentally different information than I am. Finding the right information for my map is going to take some data manipulation. Since the USGS is interested in the more technical aspects of land management, their categories are more detailed than what I need.
- -Using the PAD-US Viewer, I can investigate what information is available without having to load the entire data set into R’s viewer.
- -For this section, I am going to use the Valley of Fire State Park in Nevada, Crissey Field State Recreation Site in Oregon, and the Salton Sea State Recreation Area in California to make sure that I am getting the data I want from the Fee layer. I chose these parks because I have been to them. I know where they are and that they are all (mostly) state land.
- - - -First, I need a good way to filter out national, tribal, or military land. On the PAD-US Viewer I will look at the Valley of Fire State Park information. When you search for it (described above) a marker will pop up. If you click on the marker a table will show on the map that includes all the information available on that layer.
- - - -In the image the values on the left are the variable names - I’ll use these later to select certain variables. On the right is the value for the specific park. I’ve boxed a few of the properties which I’ll check across the different parks to make sure I can get all the information I need.
- -When we look at Crissey Field, many of these fields have different values.
- - - -The ownership type (Own_Type) is the same for both: State. The Mang_Type, Mang_Name, Des_Tp, and (obviously) Unit_Nm are different.
- -Next, we need to check the Salton Sea. The Salton Sea demonstrates the problem with finding appropriate data. Since the Salton Sea is toxic, it’s owned in part by the State of California, the Bureau of Land Management, and the Bureau of Reclamation - each represented by a different color on the map.
- - - -Having visited the park, I can tell you there are no fences or markers designating the different owners. On paper the Salton Sea may have many different owners, but in real life you wouldn’t know it just by visiting the area.
- -On the left hand side of the viewer there’s a color-coded key that indicates the different Fee managers. It’s not very user-friendly because the colors are very similar.
- - - -I don’t actually care who the fee manager of the Salton Sea is. What I am looking for, though, is whether any part of the Salton Sea includes Own_Type as State. If it does, I can use this column to separate the state data from any other kind.
- -In the viewer, I’ll look for the blueish violet color (boxed in red above).
- - - -After some trial and error I found a location about a quarter of the way down the lake’s right hand side. I am fairly certain I’ve actually been to that exact location. When looking at the table, I can see that the Own_Type matches the Valley of Fire and Crissey Field, which means I can select the state data using this property.
- -Now we’ll move back to R to load the data layer, filter for the 50 states, and for state owned land.
- -1
-2
-3
-4
-5
-6
-7
-8
-9
-10
-11
-12
-13
-14
-15
-
## create territories list
- territories <- c("AS", "GU", "MP", "PR", "VI")
-
- state_parks <- st_read("./shapefiles/original/state_parks_padus/PADUS3_0Geopackage.gpkg", layer = "PADUS3_0Fee") %>%
- filter(State_Nm %!in% territories &
- Own_Type == "STAT") %>%
- filter(Des_Tp == "ACC" |
- Des_Tp == "HCA" |
- Des_Tp == "REC" |
- Des_Tp == "SCA" |
- Des_Tp == "SHCA" |
- Des_Tp == "SP" |
- Des_Tp == "SREC" |
- Des_Tp == "SRMA" |
- Des_Tp == "SW")
-
These lines should be familiar by now, so I will briefly cover them here. For a detailed description of each line, check out part III of this series.
- -Line 2 creates the list of US territories that I filter out in line 5.
- -If you want to read more about this line visit part III of the series.
- -Since we’re working with a geopackage (instead of a shapefile) we have to load the data using st_read()
instead of read_sf()
st_read()
takes two arguments. The first is the path to the geopackage. When we downloaded the PAD-US data and unzipped it, the folder contained 35 files. From those 35 make sure you choose the one with the .gpkg
extension.
The second argument layer =
specifies which layer R should load. Here, I am selecting the PADUS3_0Fee layer because I know it contains the shape data for the state parks.
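Loading a whole layer is the step that tends to overwhelm R. As a hedged alternative, st_read() also accepts a query argument, so the filtering can happen while the file is being read. The sketch below uses an invented three-row geopackage; the column names mirror PAD-US, but the layer name toy_fee and the rows are assumptions. I have not timed this against the full PAD-US file.

```r
library(sf)

# Invented rows that borrow PAD-US column names for illustration
toy <- st_sf(
  Unit_Nm  = c("Toy State Park", "Toy National Forest", "Toy State Beach"),
  Own_Type = c("STAT", "FED", "STAT"),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(2, 2)),
                    crs = 4326)
)

tmp <- tempfile(fileext = ".gpkg")
st_write(toy, tmp, layer = "toy_fee", quiet = TRUE)

# Option 1: read the whole layer by name, as in the post
whole_layer <- st_read(tmp, layer = "toy_fee", quiet = TRUE)

# Option 2: let the SQL query keep only state-owned rows while reading
state_rows <- st_read(
  tmp,
  query = "SELECT * FROM toy_fee WHERE Own_Type = 'STAT'",
  quiet = TRUE
)
```

The second option returns only the matching rows, which may help when the full layer is too large to hold comfortably in memory.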
R and the Tidyverse’s filter() function allow you to combine conditions within one filter call by using or (|) or and (&).
The logic in this line says filter the data for rows where the State_Nm is not in the territories list (discard all but the 50 states) and the Own_Type is STAT. For the row to be selected, both conditions must evaluate to true.
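To make the combined condition concrete, here is a small sketch in base R on an invented data frame. The park names are made up, and I use !(x %in% y) in place of operator.tools’ %!in% so the example needs no extra packages; the two should behave the same way.

```r
# Invented rows that borrow the PAD-US column names
parks <- data.frame(
  Unit_Nm  = c("Toy State Park", "Toy National Forest",
               "Toy Territory Park", "Toy State Beach"),
  State_Nm = c("NV", "OR", "PR", "CA"),
  Own_Type = c("STAT", "FED", "STAT", "STAT"),
  stringsAsFactors = FALSE
)

territories <- c("AS", "GU", "MP", "PR", "VI")

# A row is kept only when BOTH conditions are TRUE:
# it is not in a territory AND it is state owned
keep <- !(parks$State_Nm %in% territories) & parks$Own_Type == "STAT"
state_only <- parks[keep, ]

state_only$Unit_Nm
```

The Oregon row drops out because it is federally owned, and the Puerto Rico row drops out because it is in a territory, even though it is state owned.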
- -levels(as.factor(state_parks$Own_Type))
-The unfiltered data set had 247,507 rows. After applying the two conditions in this line, the data set has 53,139 rows. That’s a significant reduction but still a substantial number of rows.
- -Next, I want to choose certain types of state owned land. For that, I am going to look at the Des_Tp column. According to the PAD-US documentation, the Des_Tp column holds information about the Designation Type. It contains 37 different land designations.
- -I am going to restrict my data to include the following designations:
-