I don’t love Python for Geospatial Projects 🐍

Last night I went back for a second a look at the world of Geospatial Technology in Python. While Python’s ArcGIS and QGIS bindings are widely touted — and are best way to automate things within ArcGIS or QGIS — you are much better off using R programming language for quick, low-code GIS tasks outside of ArcGIS or QGIS.

Python has a lot of advantages for certain things:

  1. It is a good scripting language that is widely supported in applications.
  2. Python is generally a stronger language for building applications to run on web servers
  3. Both ArcGIS and QGIS have really good Python bindings

But as a stand-alone platform, the Python Geospatial libraries rather suck, and are undeveloped. To be sure you can make maps in Python, you can preform various geospatial operations like transformations, raster math and geometric operations. But it takes a lot of work within Python to get nice looking maps using matplotlib, and but you don’t have access to wealth of Census shapefiles or Census data at your finger tip, and Python’s dot chaining method isn’t necessarily as elegant or readable.

I would argue that the R Programming Language and RStudio are superior in many ways over working directly with Python:

  1. R Programs using tigris library, which gives you instant access to the Census Bureau TIGER/Line with just a single command that can be easily joined again or queried against other data. There is nothing like tigris in Python. If you want to County or County Subdivision lines in Python, you will have to manually download the shapefile and then load it into Geopandas. I’ve looked for things like tigris in Python and it doesn’t exist. The basis of most maps in my experience comes from Census TIGER/Line at least in United States. Cartograpy in Python does have access to Natural Earth Dataset, but that isn’t as good as TIGER/Line in the United States.
  2. There are Census Libraries in Python but they aren’t nearly as up to date, have access to nearly as much Census data or Census TIGER/Line. A lot of maps that you make involve plotting Census data, and that requires both the TIGER/Line and the raw data. tidycensus joins them together as one command, no need to download the TIGER/Line separately like in Python.
  3. While you can chain commands in Python and GeoPandas, the chaining mechanism in R is much stronger and flexible. Often in R you can exchange, transform, load and output a map in a single chain of R commands using the tidyverse and ggplot.
  4. ggplot2 is vastly superior to matplotlib for making maps. ggplot2 has sensible defaults, the output is clean and easily theme-able. ggplot2 main limitation is that is best for simpler, easy to read SVG maps. ggplot2 can be a bit strict in enforcing how Hadley Wickham thinks a map should be presented.Β  matplotlib is more flexible in overlaying and designing maps. Of course, for complicated maps, it’s still best to export the data as a shapefile or geopackage and load it into a full GIS platform like QGIS or ArcGIS.
  5. In general, R has a quirky syntax with cute and weird names, but with more sensible defaults, it often gets geospatial work done with less code and work then the same thing done in Python with Rasterio or GeoPandas. A lot of complicated exchange-load-transform things can be done in one line of code with R. People say that Python is a compact syntax, but it really isn’t compared to R’s geospatial libraries.

I still use Python for some QGIS work but I don’t recommend it for work outside of QGIS or ArcGIS. Python should be seen as a good language to automate processes within QGIS or ArcGIS but the state of geospatial tools in Python is weak when you get away from those automating those graphical GIS applications. If you want your work to end up in a graphical GIS program for additional manual tweaking after automating things, then use Python. But if you are processing GIS data from start to finish, your best bet is R.

Leave a Reply

Your email address will not be published. Required fields are marked *