Understanding quantitative cartography and interpolated sources with historic population data
![]() |
|---|
| Africa ([1600?]), from Library of Congress Geography and Map Division |
Before 6:30pm on Tuesday, 2/20, you should submit to Canvas:
pdf or docx format, answering all the questions that are tagged with This activity will walk you through techniques and best practices associated with representing quantitative geographical data using ArcGIS Pro. You’ll use a pre-processed dataset of population estimates for African countries to visualize population change over the course of a century across the entire African continent.
Major learning objectives include:
At this point, you should be getting pretty good at starting a new project in ArcGIS Pro. Follow these instructions to set up your workspace:
First, I recommend creating a directory in your H: drive for week04, and inside that directory, make two folders for data and workspace. When I do this, it resembles:
├─ gisusers$ (H:)/
├─ week04/
├─ activity02_africa-historic-pop/
├─ workspace/
├─ submission/
You should pick a directory structure that makes sense for how you work, but the key thing to bear in mind is to pick a structure that is as futureproof as possible.
To give a concrete example, I like to store my original datasets in a separate data folder in case I accidentally corrupt my data and need to get a fresh copy.
activity02_africaHistoricPop.workspace folder, leaving the “Create a new folder for this project” box checked.You should now be looking at a fresh ArcGIS Pro project, with the two default layers of a world hillshade and world topo map.
In this activity, you’ll be working with two kinds of geographic data in this activity: tabular data and feature data.
Tabular data refers to data that is saved in table or spreadsheet format. Commonly you’ll encounter this as an xls or csv file. Tabular data is “geographic” when it contains geographic indicators – such as latitude and longitude or an address – but that doesn’t mean it’s spatial data. In order to display it in a GIS like ArcGIS Pro, there are further steps we need to take. Right now it’s only describing these geographies.
Feature data refers to data that is encoded in a spatial format. Common spatial formats include the shapefile (shp), GeoJSON (geojoson), and geodatabase (gdb). All of these formats contain structured spatial data that can be read inherently by many geographic information systems without any further processing. When you load spatial data into ArcGIS Pro, it is automatically recognized and can be instantly plotted as features on the map; hence, feature data.
Almost always, feature data contains some kind of tabular data as well, usually in the form of an attribute table like the kind you explored extensively in Lab 02.
Download the tabular data on population estimates for African countries between 1850 and 1950, either from the week 4 module in Canvas or by clicking the link below:
The tabular data we’ll be working with is a slightly processed version of this dataset. The author has written (Manning 2010:246) that he composed this dataset:
The findings… draw attention to the widespread assumptions of past observers that African populations were relatively small and that they were growing rapidly—in both colonial and precolonial eras. These pervasive assumptions were more than demographic estimates: they emerged out of ideologies that treated African societies as technically backward, politically immature, and socially elemental. Such views of African societies enabled observers to make aggregate generalizations without exploring the details of African social interaction.
As you’ll see in greater detail next week, counting people is a political action. In this case, the author is counting people (e.g., making demographic estimates across a historical time period) in order to debunk Eurocentric narratives about growth and chance in pre- and post-colonial African societies.
When you encounter a new dataset, your first questions should always be: why does this exist? Where did the data come from? How was it computed? These are important questions to clarify before making a map.
If you want, you can read more about how this dataset was actually created in Appendix A of the original xls spreadsheet: https://doi.org/10.7910/DVN/ZP3PP1
Take a moment to open this dataset in Microsoft Excel. You should see something like:

There are lots of places where you could download spatial data of African countries. For this activity, we’ll use Stanford University’s EarthWorks portal, a massive discovery tool for geospatial data. (Alternatives include the Tufts GeoData portal, Harvard’s Dataverse, Princeton’s Maps & Geospatial Data Portal, and the Big Ten Academic Alliance’s Geoportal.)
africa country boundariesExport it as a shapefile

zip file. Extract it and move the contents into your data folderGo ahead and bring the feature data (countries) into ArcGIS Pro, either by:
data folder (or wherever you saved the data) ➡️ drag and drop from Catalog into map/data frame
Then, add the tabular data (spreadsheet) using the same method. Once you’re done, you should see both layers displayed in your Contents pane, which will have a new category for “standalone tables”:

Open the africa_pop_est_1850-1950_cleaned file by right-click ➡️ “Open” (or ctrl+T). Some general observations:
gisname field is supposed to be emptyST_region means “Slave trade region”pop_ columns show decennial population estimates, and they don’t seem to follow changes in historical geographies; rather, they’re all mapped to modern territorial boundariesAlthough we understand intuitively how this “standalone table” should be mapped – e.g., we want to associate the values in the pop_ columns with geographic features on the map, based on the country name – that isn’t enough for ArcGIS Pro. A country name is not inherently spatial data: it’s just geographic information.
Turning geographic information into spatial data is one of the most fundamental and powerful aspects of GIS software. Typically, we call the process of connecting tabular data with spatial data a table join – but before we get to that workflow, let’s take a moment to explore our spatial data as well.
Even the tidiest datasets will often need further pruning, massaging, and TLC. Our two datasets for this activity are no exception.
Take the countries shapefile, for instance, currently displayed as Africa in your contents pane. Now might be a good time to rename the layer to something like “African Countries,” which I’m going to do.
Open the African Countries layer’s attribute table.
Hachi machi! Why, pray tell, are there nearly 5,000 records in this shapefile of African countries?

Go ahead and zoom into one of those features by double-clicking on the table row number on the attribute table’s left-hand side…

… and you’ll probably find yourself looking at a tiny polygon off the coast, like this:

In addition to being pretty heavily generalized, as the jagged edges along the coast indicate, this shapefile appears to have encoded every little island off the coast as a separate record.
There’s nothing wrong with that per se, but for our purposes today, that’s way too detailed: ideally, we just want a shapefile that contains one record for every country in the African continent.
Two methods come to mind for making this data work for you.
ArcGIS Pro offers a wide range of geoprocessing tools for data management. A classic is Dissolve.
Dissolve aggregates features based on specific attributes. For example, the abstract illustration below consolidates features of like color into a single geometry:
![]() |
|---|
| Dissolve tool, from Esri |
Let’s make that abstract example more concrete by applying it to the African Countries layer.
Open the Dissolve tool by clicking into the Analysis pane ➡️ click Tools

Search for the Dissolve tool by typing “dissolve” into the search bar of the geoprocessing pane that opens. Click Dissolve.

Set the tool’s parameters like so…
African CountriesAfricanCountries_Dissolved – save this to africanHistoricalPopulation.gdbNAME
any records that might be selected. If you have any records selected, the geoprocessing operation will only be performed on them.… and click Run.
Now you should have a new layer in your contents pane for AfricanCountries_Dissolved. Open its attribute table – you should now see only 70 records.

Select one of the records and zoom into the coast until you find a few islands. Notice how all the geometries that shared a value in the NAME field have been consolidated even if they’re not touching one another. This is called a multipart feature – or a single feature that contains noncontiguous elements and is represented in the attribute table as one record – and it’s a common result of running the Dissolve tool:
![]() |
|---|
| The coast of Algeria as a multipart feature |
| 1. There are 54 countries in Africa and our dissolved layer has 70 – still more records than we would expect. Use the attribute table to figure out why that is, and write down an example of a feature that is inflating the feature count. Why does it seem like this feature exists? Do you think you need to keep it for this exercise? |
The Dissolve tool is a handy way to automate consolidating features, but sometimes it’s important to get a little bit more precise. For example, let’s say we didn’t want to include all those little islands along the coast at all – can we just delete them?
Obviously, you wouldn’t want to pan through the map, selecting and deleting all nearly 5,000 extra geometries by hand. Instead, to delete these features quickly, you’d want to:
Let’s try…
AfricanCountries_Dissolved and open the attribute table for your original shapefile, African CountriesareaareadoubleNumericAnd save it, either with the Save button in the Fields tab at the top of your screen or by right-clicking in row itself and choosing “Save” (as highlighted below).

Typically, when you are calculating the geometry of any feature or set of features, you’ll want to first ensure it’s projected in a planar coordinate system. If you inspect the source information of our
African Countrieslayer, you’ll note that it is only projected in a geographic coordinate system – not a planar (also known as “projected”) coordinate system. For the sake of brevity in this activity, I’m skipping that step, but it would cause problems later on if we were to continue this analysis.
African Countries, right-click the area ➡️ “Calculate Geometry”. This will automatically open the Calculate Geometry geoprocessing tool. Set the parameters to:
African Countriesarea, while Property = Area (geodesic)Square kilometersGCS_WGS_1984 (you can select this by clicking on the graticule
icon next to the box)area field populate with a bunch of values. Go ahead and sort those values in descending (highest to lowest) order with right-click ➡️ “Sort Descending”.2. Scroll through the attribute table and examine the sorted data, double-clicking on record numbers to zoom into different features. Is there a break point after which it seems like you could delete most of the smaller features without erasing any actual countries? Where is that break point, and why did you choose it? Specify it with the record number and FID (e.g., “Record #90, FID 3500.”) |
Any time you are manually deleting data, you run the risk of deleting the wrong thing. When you do, there’s no going back. For example, you could delete small but very important features accidentally – such as islands in Cabo Verde – if you weren’t being careful. Approach any manual data editing with a healthy dose of caution, and always use the “Discard”
button in the Editing tab if you need to roll back edits.
For the purposes of this activity, we’ll proceed with the dissolved layer. Go ahead and remove the African Countries layer from your project and rename AfricanCountries_Dissolved to African Countries…
… but one last thing before you proceed.
While most of the tabular data we’ll be working with in this activity maps directly to modern geographical boundaries in Africa, there is one exception: Sudan and South Sudan. We’ll need to merge these two features together in order to properly join the data later on.
To merge these two features:
Select them both from the map (hold down the shift key while clicking with the selection tool):

Click the Edit tab and then click the Modify button.

The Modify pane should open up on the right-hand side of your screen. Search for “merge” in the toolbar, and a “Merge” function should pop up in the Construct category. Click it.

African CountriesSOUTH SUDAN and SUDANSUDANClick Merge

Once you’ve merged, you should see something like this:

Notice how there’s still a little section of Central Sudan that has been left separate? Using the skill you just learned, merge that feature into the country boundary for Sudan.
Be sure to preserve the features for SUDAN, not In dispute SOUTH SUDAN/SUDAN. |
The table join is a crucial GIS workflow. Esri describes it as such:
Through a common field, known as a key, you can associate records in one table with records in another table. For example, you can associate a table of parcel ownership information with the parcels layer, because they share a [common] parcel identification field.
![]() |
|---|
| A drawing showing how attributes (the numbers) and features (the shapes) are joined together. The dotted lines represent the “common field” that links the two data (by Tess McCann, 2021). |
This is precisely what we want to do with our data: we want to join the “standalone table” of population data to the African Countries layer.
Joins only work if the common identifier is exactly the same in both tables: there can be absolutely no differences between the common field’s value in either table, or else the data for that record won’t be joined. For example,
Senegalwould not join toSENEGALbecause the latter is all caps. For this reason, it’s typically ideal to use a numerical key rather than a string (e.g., text-based) key.
Moving forward, the major question is, do we have a common field in our two tables?
Because we explored the attribute tables earlier, we know there is information for country name in both tables. Let’s find out how suitable those fields would be as a common field.
African Countries layer and one for your standalone table africa_pop_est_1850-1950_cleanedClick and drag one of those tables into the side-by-side view (an option for “pinning” the table in various locations should open up once you’re begun to drag it), as shown in the gif below:

NAME field in the African Countries table in alphabetical order, from first to last (you can do this by double-clicking on the field name). Do the same with the territory field in the africa_pop_est_1850-1950_cleaned table.| 3. Why won’t this work as a common field for our table join? |
It looks like we’re going to have to make a common field on our own. Although working with data across a historical time range does make this a little more complicated than normal our goal remains pretty straightforward: we want to populate the gisname field in the africa_pop_est_1850-1950_cleaned tablae with values that conform exactly to the NAME field of the African Countries layer.
This is a pretty simple exercise: all you need to do is make the gisname field say the same thing as the name field of the corresponding country. To accomplish this, I recommend:
territory field into the gisname fieldgisname values and conforming them to the NAME field, correcting any spelling or nomenclature discrepanciesLet’s get started…
africa_pop_est_1850-1950_cleaned in Microsoft Excel. (You might need to right-click on the file and select “Open with”…)Go back to ArcGIS Pro. Select all the records in your Africa Countries layer and click the Copy button in the header of the attribute table.

Paste the copied values somewhere in your Excel spreadsheet. Here’s where I dumped mine:

Temporarily, we’re going to start treating our Excel spreadsheet like more of a scratch pad than a data spreadsheet. We’ll move a bunch of things around, with the goal of making the currently empty gisname field conform exactly to the NAME field to which we want this data joined. By the time we’re done, it will be fully joinable and GIS friendly.
Now, right-click on the header above the NAME field, and select “Copy”. That field was in the R column for me, but it might be in a different column for you.

Once it’s copied, right-click on your territory field – it should be the A column – and click “Insert copied cells.” This will paste the copied column to the left of the existing data.

Now you should have a list of the destination field (e.g., the NAME field to which we want to join our spreadsheet data) right next to the input field (e.g., the territory field that we must conform to the NAME field before it’s actually joinable).
You can delete the leftover cells copied from ArcGIS Pro. Your screen should resemble:

Click into the Q2 cell – it should be to the right of your data – and paste the following expression:
=UPPER(B2)
This expression will paste the value from the B2 cell into the Q2 cell, but capitalized.
Copy the Q2 cell itself by right-clicking on the cell ➡️ “Copy” (or ctrl+V) and then paste the contents into cells Q3:Q50 (using either right-click ➡️ “Paste” or ctrl+V), as shown in the GIF below:

Now that you have all your values capitalized, it’s time to paste them where they belong. Copy the new column (cells Q2:Q50 for me) and select the pasting destination: the gisname field, column C, cells 2:50.
Instead of pasting normally, right-click in the highlighted cells (C2:C50) and select “Paste Special” ➡️ “Paste Values”. If you don’t choose this option, the formula – e.g., =UPPER(B2) – will be pasted instead of the values.
Again, the GIF below demonstrates:

Now, read through every value in the gisname column and compare it to its corresponding value in the NAME column to ensure that gisname is exactly the same as NAME. If the two values differ, update gisname to match the value in NAME.
To make this even a little easier, you could cut the NAME column and insert it to the left of the gisname column like so:

Lastly, a couple of important notes on nomenclature for data from the spreadsheet:
THE GAMBIA in the African Countries tableAfrican Countries data (like Cape Verde) aren’t present in the africa_pop_est_1850-1950_cleaned table – that’s okay
Do not change any of the NAMEcolumn values – they are simply reference cells for theAfrican Countrieslayer. Only change values in thegisnamecolumn.
Once you’re done, you can delete the NAME column from your spreadsheet. The final product should resemble:

To complete the join, open ArcGIS Pro and…
africa_pop_est_1850-1950_cleaned and re-add it from the catalog, just to make sure our latest changes are reflected in the file path that ArcGIS Pro is readingWe want to join the standalone table africa_pop_est_1850-1950_cleaned to the African Countries Right-click on the African Countries layer ➡️ “Joins and Relates” ➡️ “Add Join”

When you click “Add Join,” a new dialog box will open up. Fill it out like so:

Click Validate Join. This prints a dialog that shows how many records will be joined. Towards the bottom of that box, you should be able to find something like this message:
Checking for join cardinality (1:1 or 1:m joins)...
A one - to - one join has matched 49 records.
Nice – it sounds like all 49 records from the data should join successfully!
That word cardinality, by the way, is describing the nature of the relationship between the input and join tables. Cardinality can be one-to-one (1:1), in which one record in one table is associated with a single record in the other table, or one-to-many (1:m), in which one record in one table is associated with multiple records in the other table. You can also have many-to-many (m:m) cardinality, which 🥵. We don’t need to dive deeply into cardinality today, but it’s good to be familiar with the term. What is the cardinality of this join?
Anyhoo: let’s click OK to process the join.
If you open the attribute table for the African Countries layer, you should now see a bunch of new fields from our join table – woohoo and congratulations! You’ve just successfully completed a table join, by using a common identifier (country name) to attach tabular attribute data (e.g., decennial population estimates) to spatial feature data (African country boundaries).

Some of the records have values of <Null>, which is okay – it just means there was no joinable data for that record.
|
|:- | Table joins do not save tabular data to feature data; rather, the create a link between the two files. Try opening the Fields view of the attribute table… |
… and you’ll notice that all the fields you just joined are prefixed with
africa_pop_est_1850-1950_cleaned.csv.– in other words, the name of thecsvthat you joined to theAfrican Countrieslayer. | In order to actually save the data, you’ll need to export theAfrican Countrieslayer as a new feature layer (like you did with some of the data in Lab 02). | Do so by right-clicking the layer ➡️ “Data” ➡️ “Export Features”. Name the output file something likeAfricanCountries_Joined. And please, don’t skip this step. It will make your life harder in about 15 minutes. There are geospatial workflows you can’t run on table-joined data layers, such as calculating area, which we’re about to do. | In the Export Features dialog, you’ll also want to unfold the Fields tab and check the box for “Use Field Alias as Name” – this will ensure your field names do not contain the entirecsvfilename as a prefix.
In the next section, we’ll discuss not just how to classify this data, but how classification operates more generally as a powerful tool for getting your message across.
In Lab 02, you learned about symbology: the set of ArcGIS Pro tools that let you style your data with different colors, shapes, and sizes.
Any time you symbolize data, you are implicitly breaking it up into different representational classes (also sometimes called bins or buckets). The techniques for dividing those classes from one another are called data classification.
Start by displaying population for the year 1850 as graduated symbols (refer to the graduated symbols section of Lab 2 or the ArcGIS Pro documentation if you don’t remember how).
You should end up with something like this:

Try toggling the minimum and maximum symbol sizes until it feels right to you.
Now let’s compare this map side-by-side with a representation of the same data in choropleth form.
Even if you don’t know/remember what a choropleth map is, you’ve seen one before. Its name derives from Greek: “bounded space (χώρη or chôra) over which a mass or throng (plethos) is extended” (Crampton 2009:27). Put plainly, “choropleth” means “quantity by area” – and that’s exactly what your choropleth map will show.
On the ribbon, click the Insert tab. In the Project group, click the New Map drop-down arrow:

Map2 – in a new tab.Map2 tab and drag it so that it’s set in a side-by-side view, like so:

AfricanCountries_Joined data layer from the contents pane of Map into the data frame of Map2. This will copy the data layer so that it appears in both map views.
Map2 pane, update the symbology so that you are using a “Graduated color” style instead of a “Graduated symbols” style. Make sure that the “Field” parameter is set to pop_1850. Pick an appropriate color scheme (e.g., a sequential color scheme), and you should see something like this:

| 4. Take a moment to compare the two representational techniques: graduated symbols vs. graduated colors. Which symbology type feels better more intuitive for representing this data? Why? |
Once you’ve answered this question, you can exit the Map tab, keeping only the choropleth map open.
In the symbology tab of your AfricanCountries_Joined layer, click the “Method” drop down and note the different options that you can select for representing your data. You can read about classification schemes in greater detail in this excellent post from Axis Maps. The table below breaks down four major schemes that you’ll commonly encounter and the contexts in which they should be used:
| Method | Description | Problems | Use case |
|---|---|---|---|
| Natural Breaks (Jenks) | Minimize difference within classes and maximize differences between them. | Class ranges are always specific to that data | Unevenly distributed data |
| Quantile (or Quintile) | Places an equal number of observations into each class. | Class ranges can have major size variation | Emphasizing relative positions of data values |
| Equal Interval | Divides the data into equal size classes. | Class ranges are static | Evenly distributed data & comparisons of data across time |
| Standard Deviation | Adds and subtracts standard deviation from the mean of the data. | Only works for normally distributed data | Data with normal distribution |
Now open the histogram for the field pop_1850 (right-click the field name ➡️ “Statistics”). You should see something like this:

The histogram gives us a glimpse into how the data is distributed. Understanding this empowers us to analyze and represent it. Consider:
| 5. Based on your reading of the data (and in no more than 3 sentences), pick one representational technique that you think fits this dataset best. What are its benefits and drawbacks? |
No matter what classification scheme you chose, you’re looking at data that is not normalized; that is, data that hasn’t been adjusted to transform raw counts into ratios.
Because raw counts can be really misleading, normalizing data is a critical part of any mapmaking endeavor. (You might also hear normalization referred to as “standardization” – they can be used interchangeably, but note that normalization in the geospatial context should not be confused with statistical normalization.)
Normalization requires two inputs: a numerator (the attribute value for feature x) and a denominator (the value by which you want to standardize feature x).
The numerator should always be the property you want to display on your map, but there are a few choices when it comes to denominator…
… or put another way:
| Denominator type | Expression |
|---|---|
| By area | (Attribute a for feature x / Area of attributes in all features) = Density of a in x |
| By total | (Attribute a for feature x / Total of attributes in all features) = % of total of a in x |
| By another attribute | (Attribute a for feature x / Attribute b for feature x) = % of a by b for x |
| By time | (Attribute a at time ø for feature x / Attribute a at time ∆ for feature x ) = % change of a between ø and ∆ in feature x |
ArcGIS Pro has a handy parameter in the symbology pane that allows you to set a value for normalization:

Let’s try that out for a few of these: area, total, and time (in this dataset, there’s not really another meaningful attribute [e.g., gender] by which to normalize the population data).
Normalizing by area will allow us to make a map that shows population density, e.g., people by square miles, square kilometers, etc.
If you open the attribute table for AfricanCountries_Joined, you’ll notice that we already have a field for area. However, the values in it don’t make a lot of sense (at least to me)…

… so let’s compute a new one. You’ve done this before – for this very activity, in fact! – so if an area field is not already present in your attribute table, go ahead and follow the instructions from earlier to create one.
You should set “property” to Area (geodesic) and “Area Unit” to square kilometers.
Once you have this new field, you should see values like this:

It kind of looks like that Shape_Area field is ~1/10,000 of the actual area in square kilometers, doesn’t it? Weird. Anyhoo. Here’s the easy (and final) part.
In the symbology pane:
areapop_1860, pop_1920)Natural breaks, Quantile, Equal interval)<percentage of total>) and time (“Normalization” = some other population timestamp field)| 6. In 2-3 sentences, reflect on the differences between the normalization techniques. What does each one highlight and what does each one obscure? |
Before 6:30pm on Tuesday, 2/20, you should submit to Canvas a document in pdf or docx format, answering all the questions that are tagged with and which are summarized below:
| 1. There are 54 countries in Africa and our dissolved layer has 70 – still more records than we would expect. Use the attribute table to figure out why that is, and write down an example of a feature that is inflating the feature count. Why does it seem like this feature exists? Do you think you need to keep it for this exercise? |
2. Scroll through the attribute table and examine the sorted data, double-clicking on record numbers to zoom into different features. Is there a break point after which it seems like you could delete most of the smaller features without erasing any actual countries? Where is that break point, and why did you choose it? Specify it with the record number and FID (e.g., “Record #90, FID 3500.”) |
| 3. Why won’t this work as a common field for our table join? |
| 4. Take a moment to compare the two representational techniques: graduated symbols vs. graduated colors. Which symbology type feels better more intuitive for representing this data? Why? |
| 5. Based on your reading of the data (and in no more than 3 sentences), pick one representational technique that you think fits this dataset best. What are its benefits and drawbacks? |
| 6. In 2-3 sentences, reflect on the differences between the normalization techniques. What does each one highlight and what does each one obscure? |
Crampton, Jeremy. 2009. “Rethinking Maps and Identity: Choropleths, Clines, and Biopolitics.” In Rethinking Maps, Routledge.
Manning, Patrick. 2010. “African Population: Projections, 1850-1960.” In The demographics of empire: the colonial order and the creation of knowledge. Accessed February 4, 2024. https://muse.jhu.edu/chapter/242195.