24  *Lab: land cover statistics

The goal of this lab is to extract statistics about land cover from a raster.

Notebook Setup
  1. In the Tsosie server, start a new JupyterLab session or access an active one.

  2. Create a new notebook in your eds-220-sections directory. Update the name to ‘landcover-stats.ipynb’.

  3. Use the terminal to stage, commit, and push this file to the remote repository.

General directions
  • Add comments in each one of your code cells
  • Include markdown cells in between your code cells to add titles/information to each exercise
About the data

For this task we are going to use two datasets.

First dataset

A small section of the GAP/LANDFIRE National Terrestrial Ecosystems data for 2011, from the US Geological Survey (USGS). This is a raster file with a 30 m x 30 m pixel resolution. Each cell in the raster has a number representing the type of land cover.

The data was pre-processed in the Microsoft Planetary Computer to show a small region around Mount Whitney in California.

Further information about the dataset can be accessed via the the dataset’s Digital Object Identifier (DOI) link:

U.S. Geological Survey (USGS) Gap Analysis Project (GAP), 2016, GAP/LANDFIRE National Terrestrial Ecosystems 2011: U.S. Geological Survey data release, https://doi.org/10.5066/F7ZS2TM0.

Second dataset

The second dataset is a csv file with the name of the land cover associated with each code.

Data access

Download the data from Canvas - week 7. The zip file you download includes a .tif file and a .csv file.

24.1 Land cover statistics

24.2 Import data

  1. Import the raster file and store it in a variable lulc. LULC stands for land use/land cover.

  2. Import the tabular data and store it in a variable class_names.

24.3 Raster exploration

Discuss with your team (at least) three methods to explore your raster and implement them.

24.4 Raster reduction

How many dimensions does lulc cover have? If needed, get rid of unnecessary dimensions and verify your result.

24.5 Percentage of area covered per class

24.5.1 Pixels per class

Use the numpy function np.unique() to get the number of pixels per class in lulc. HINT: check the np.unique() documentation to see what the return_counts parameter does and read the last example.

24.5.2 Organize data

Create a data frame pix_counts with two columns: column one must be the code numbers for the pixels in lulc and column two must be the number of pixels corresponding to each code. HINT: check our class notes.

24.5.3 Add class names

Use the class_names data frame to add the class names to the codes in the pix_counts data frame. Store the resulting data frame as classes.

24.5.4 Percentage covered

  1. Store the total number of pixels in lulc as a variable total_pixels without hard-coding any numbers. This means, calculate total_pixels from attributes of lulc.

  2. Add the percentage of area covered by each class as a new column percentage to the classes data frame. Round the percentage to 8 decimal points. HINT: check our class notes.

  3. Discuss with your team how would you check that the total number of pixels in the lulc raster is actually distributed across the classes dataframe. i.e. How would you check that you did not lose any pixel when counting the unique pixels per class or adding the class names? Make one such check.

24.6 Plot

  1. Create a horizontal bar plot showing the classes with more than 1% land cover in decreasing order (longest bar should be at the top). The names of the classes should be the tick labels of the vertical axis.

  2. ✨🐍✨ Try redoing your plot as a one-liner, without creating any intermediate variables.