2021-01-19
For more on setting up the environment and sample data, see the preparation document.
Data | File(s) | Format | Source |
---|---|---|---|
“Nafot” | nafot.shp (+7) | Shapefile | https://www.gov.il/he/Departments/Guides/info-gis |
Railways | RAIL_STRATEGIC.shp (+7) | Shapefile | https://data.gov.il/dataset/rail_strategic |
Statistical areas | statisticalareas_demography2018.gdb | GDB | https://www.cbs.gov.il/he/Pages/geo-layers.aspx |
The data for this tutorial can be downloaded from:
https://github.com/michaeldorman/R-Spatial-Workshop-at-CBS-2021/raw/main/data.zip
A script with the R code of this document is available here:
https://github.com/michaeldorman/R-Spatial-Workshop-at-CBS-2021/raw/main/main.R
All of the materials are also available on GitHub.
Please feel free to ask questions as we go along!
Software in general, and software for spatial analysis in particular, is characterized by two types of interfaces: the Graphical User Interface (GUI) and the Command Line Interface (CLI).
In a GUI, our interaction with the computer is restricted to a predefined set of input elements, such as buttons, menus, and dialog boxes. In a CLI, we interact with the computer by writing code, which means that our instructions are practically unconstrained. In other words, with a CLI, we can give the computer specific instructions to do anything we want.
R, which we talk about today, is an example of CLI software for working with (among other things) spatial data.
Figure 1.1: QGIS, an example of Graphical User Interface (GUI) software
Figure 1.2: R, an example of Command Line Interface (CLI) software
R is a programming language and environment, originally developed for statistical computing and graphics. Notable advantages of R are that it is a full-featured programming language that is nonetheless tailored for working with data, that it is relatively simple, and that its official repository holds a huge collection of ~16,000 packages from various areas of interest.
Over time, the number of contributed packages for handling and analyzing spatial data in R has grown steadily. Today, spatial analysis is a major functionality in R. As of October 2020, there are ~185 packages specifically addressing spatial analysis in R, and many more are indirectly related to spatial data.
Figure 1.3: Books on Spatial Data Analysis with R
Some important events in the history of spatial analysis support in R are summarized in Table 1.1.
Year | Event |
---|---|
pre-2003 | Variable and incomplete approaches (MASS, spatstat, maptools, geoR, splancs, gstat, …) |
2003 | Consensus that a package defining standard data structures should be useful; rgdal released on CRAN |
2005 | sp released on CRAN; sp support in rgdal |
2008 | Applied Spatial Data Analysis with R, 1st ed. |
2010 | raster released on CRAN |
2011 | rgeos released on CRAN |
2013 | Applied Spatial Data Analysis with R, 2nd ed. |
2016 | sf released on CRAN (Section 2.1) |
2018 | stars released on CRAN |
2019 | Geocomputation with R (https://geocompr.robinlovelace.net/) |
2021(?) | Spatial Data Science (https://www.r-spatial.org/book/) |
A question that arises, at this point, is: can R be used as a Geographic Information System (GIS), or as a comprehensive toolbox for doing spatial analysis? The answer is definitely yes. Moreover, R has some important advantages over traditional approaches to GIS, i.e., software with GUIs such as ArcGIS or QGIS.
General advantages of Command Line Interface (CLI) software include:
Moreover, specific strengths of R as a GIS are:
Nevertheless, there are situations when other tools are needed:
(see the mapedit package)
The following sections (1.5–1.11) highlight some of the capabilities of spatial data analysis packages in R, through short examples.
sf and stars
Reading spatial layers from a file into an R data structure, or writing the R data structure into a file, are handled by external libraries:
sf: Vector Layers
GEOS is used for geometric operations on vector layers with sf:
Figure 1.4: Buffer function
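As a minimal sketch of a GEOS-backed geometric operation, the code below buffers a single point with sf. The point coordinates and the 1 km distance are hypothetical values chosen for illustration, and the ITM coordinate reference system (EPSG:2039) is assumed so that distances are in meters.

```r
library(sf)

# A hypothetical point in the ITM projected CRS (EPSG:2039); units are meters
pnt = st_sfc(st_point(c(200000, 600000)), crs = 2039)

# 1 km buffer around the point
buf = st_buffer(pnt, dist = 1000)

# The result is a polygon approximating a circle
st_geometry_type(buf)

# Area is slightly less than pi * 1000^2, since the circle is approximated by straight segments
st_area(buf)
```

The same st_buffer call works on a complete layer (an sf object), in which case each feature gets its own buffer.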
stars: Rasters
Geometric operations on rasters can be done with package stars:
Arithmetic (+, -, …), Math (sqrt, log10, …), logical (!, ==, >, …), summary (mean, max, …), Masking
Figure 1.5: Reprojection of the MODIS NDVI raster from Sinusoidal (left) to ITM (right)
gstat: Interpolation
Univariate and multivariate geostatistics:
Figure 1.6: Predicted Zinc concentration, using Ordinary Kriging
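A possible Ordinary Kriging workflow with gstat is sketched below. It assumes the meuse sample data that ships with the sp package. The log transformation and the starting values of the spherical variogram model are illustrative choices, not necessarily those used to produce Figure 1.6.

```r
library(sf)
library(gstat)

# Meuse sample data from the sp package: zinc measurements and a prediction grid
data(meuse, package = "sp")
data(meuse.grid, package = "sp")
pnt = st_as_sf(meuse, coords = c("x", "y"), crs = 28992)
grid = st_as_sf(meuse.grid, coords = c("x", "y"), crs = 28992)

# Empirical variogram of log-transformed zinc concentration
v = variogram(log(zinc) ~ 1, pnt)

# Fit a spherical model; the starting values (sill, range, nugget) are assumptions
m = fit.variogram(v, vgm(1, "Sph", 900, 1))

# Ordinary Kriging: predict at the grid locations
pred = krige(log(zinc) ~ 1, pnt, grid, model = m)
```

The result holds the predicted values (var1.pred) and the kriging variance (var1.var) for each grid location.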
spdep: Spatial dependence
Modelling with spatial weights:
Figure 1.7: Neighbors list based on regions with contiguous boundaries
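As a rough illustration, a neighbors list based on contiguous boundaries can be built with spdep as follows. The sketch assumes the North Carolina counties layer that ships with sf. Moran's I on the SID74 column is just one example of what the spatial weights can then be used for.

```r
library(sf)
library(spdep)

# Sample polygon layer shipped with the sf package
nc = st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Neighbors list: polygons sharing a boundary are considered neighbors
nb = poly2nb(nc)

# Row-standardized spatial weights
lw = nb2listw(nb, style = "W")

# Moran's I test for spatial autocorrelation in one of the attributes
moran.test(nc$SID74, lw)
```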
spatstat: Point patterns
Techniques for statistical analysis of spatial point patterns (Figure 1.8), such as:
Figure 1.8: Distance map for the Biological Cells point pattern dataset
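A distance map such as the one in Figure 1.8 can be computed along the following lines, using the cells dataset that comes with spatstat. The K-function estimate is added here only as one more example of a point pattern technique.

```r
library(spatstat)

# Distance map: for each pixel, the distance to the nearest point of the pattern
d = distmap(cells)
plot(d, main = "Distance map")
plot(cells, add = TRUE)

# Ripley's K-function estimate, a common summary of clustering or regularity
K = Kest(cells)
```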
RPostgreSQL: PostGIS
When:
we may want to combine R with a (spatial) database.
Package sf (Section 2.1) combined with RPostgreSQL can be used to read from, and write to, a PostGIS spatial database. First, we need to create a connection object:
library(sf)
library(RPostgreSQL)
con = dbConnect(
  PostgreSQL(),
  dbname = "gisdb",
  host = "159.89.13.241",
  port = 5432,
  user = "geobgu",
  password = "*******"
)
Then, we can read from the database, just like from a file, using the st_read function (Section 2.3). Writing to the database is done with st_write.
st_read(con, query = "SELECT name_lat, geometry FROM plants LIMIT 3;")
## Loading required package: DBI
## Simple feature collection with 3 features and 1 field
## geometry type: POINT
## dimension: XY
## bbox: xmin: 35.19337 ymin: 31.44711 xmax: 35.67976 ymax: 32.77013
## geographic CRS: WGS 84
## name_lat geometry
## 1 Iris haynei POINT (35.67976 32.77013)
## 2 Iris haynei POINT (35.654 32.74137)
## 3 Iris atrofusca POINT (35.19337 31.44711)
dbDisconnect(con)
## [1] TRUE
sf package
The sf package (Figure 2.1), released in 2016, is a newer package for working with vector layers in R, which we are going to use in this tutorial. In recent years, sf has become the standard package for working with vector data in R, practically replacing sp, rgdal, and rgeos.
Figure 2.1: Pebesma, 2018, The R Journal (https://journal.r-project.org/archive/2018-1/)
One of the important innovations in sf is a complete implementation of the Simple Features standard. Since 2003, Simple Features have been widely implemented in spatial databases (such as PostGIS) and commercial GIS (e.g., ESRI ArcGIS), and they form the vector data basis for libraries such as GDAL. The Simple Features standard defines several types of geometries, of which seven are most commonly used in the world of GIS and spatial data analysis (Figure 2.2). When working with spatial databases, Simple Features are commonly specified as Well Known Text (WKT).
Figure 2.2: Seven Simple Feature geometry types most commonly used in GIS (see also: https://r-spatial.github.io/sf/articles/sf1.html)
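To make the WKT notation more concrete, here is a small sketch that converts WKT strings to sf geometries and back. The coordinates are arbitrary values chosen for illustration.

```r
library(sf)

# Three of the seven common geometry types, written as WKT (arbitrary coordinates)
wkt = c(
  "POINT (35.2 31.8)",
  "LINESTRING (0 0, 1 1, 2 1)",
  "POLYGON ((0 0, 1 0, 1 1, 0 0))"
)

# Convert WKT strings to a geometry column
g = st_as_sfc(wkt)

# Geometry type of each element
st_geometry_type(g)

# Convert the geometries back to WKT
st_as_text(g)
```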
The sf package depends on several external software components (installed along with the R package), most importantly GDAL, GEOS and PROJ (Figure 2.3). These well-tested and popular open-source components are common to numerous open-source and commercial software for spatial analysis, such as QGIS and PostGIS.
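One way to check which versions of these external components a given sf installation is linked to is the sf_extSoftVersion function, as in the short sketch below. The exact version numbers depend on the local installation.

```r
library(sf)

# Named character vector with the versions of GEOS, GDAL, PROJ, etc.
v = sf_extSoftVersion()
v[c("GEOS", "GDAL", "PROJ")]
```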