Performing Table Joins (QGIS3)¶
Not every dataset you want to use comes in spatial format. Often the data would come as a table or a spreadsheet and you would need to link it with your existing spatial data for use in your analysis. This operation is known as a Table Join and is done using the Join attributes by field value
Processing algorithm.
Overview of the task¶
We will use a shapefile of census tracts for California and population data table from US Census Bureau to create a population density map for california.
Other skills you will learn¶
Loading CSV files that do not contain any geometry in QGIS.
Using DB Manager to perform SQL queries to calculate group statistics.
Get the data¶
US Census Bureau provides TIGER/Line Shapefiles. You can visit the FTP site and download census tracts shapefile for California. Download Census Tracts for California file.
Americal FactFinder is a repository of all census data for the US. You can use Advanced Search and query for the Topic - Basic Count/Estimate and Geographies - All Census Tracts in California to create a custom CSV and download it. This tutorial uses TOTAL POPULATION | 2017 ACS 5-year estimates
data.

For convenience, you may directly download a copy of both the datasets from the links below:
Data Source [TIGER] [USCENSUS]
Procedure¶
Locate the
tl_2018_06_tract.zip
file in the QGIS Browser and expand it. Select thetl_2018_06_tract.shp
file and drag it to the canvas.
You will see the layer
tl_2018_06_tract
loaded in the Layers panel. This layer contains the boundaries of census tracts in California. Right-click on thetl_2018_06_tract
layer and select Open Attribute Table.
Examine the attributes of the layer. To join a table with this layer, we need a unique and common attribute for each feature. In this case, the
GEOID
field is a unique identifier for each tract and can be used to link this layer with any other layer or table containing the same ID.
Unzip the
ACS_17_5YR_B01003.zip
file and open theACS_17_5YR_B01003_with_ann.csv
file in a text editor. You will notice that each row of the file contains information about a tract along with the unique identifier we saw in the previous step. Note that this field is calledGEO.id2
in the CSV. You will also note that theHD01_VD01
column has population value for each of the census tract.
Before importing this CSV file, we need to make a minor edit. QGIS CSV importer expects the first row of the file to contain the column headers and all remaining rows to contain the data for these columns. This file contains an extra row 2 with column labels. Delete that row and save the file.
Now we are ready to import the CSV file to QGIS. Go to
.
In the Data Source Manager window, click the … button and browse to the CSV file and select it. Make sure you have selected File format as CSV (comma separated values). Since we are importing this as a table, we must specify that our file contains no geometry using the No geometry (attribute table only) option. Verify that Sample Data preview at the bottom looks fine and click Add followed by Close.
The CSV will now be imported as a table to QGIS and will appear as
ACS_17_5YR_B01003_with_ann
in the Layers panel. Now we are ready to create the table join. Go to .
First we need to change a default setting in the Processing Toolbox. Click the Options button.
In the Processing Options tab, check the Use filename as layer name option. When using algorithms from Processing Toolbox, this option makes the output layer names much more intuitive and useful. Click OK.
Back in the Processing Toolbox, search and locate the algorithm and double-click it to open it.
In the Join Attributes by Field Values dialog, select
tl_2018_06_tract
as Input layer andGEOID
as the Table field. SelectACS_17_5YR_B01003_with_ann
as the Input layer 2 andGEO.id2
as the Table field 2. Leave other options to their default values and click the … button to select the output file location and selectSave to GeoPackage...
.
Name the output geopackage as
joined.gpkg
and the output layer asjoined
. Click Run.
Once the processing finishes, verify that the algorithm was successful and click Close.
You will see a new layer
joined
loaded in the Layers panel. At this point, the fields from the CSV file are joined with the census tracts layer. You can close the Processing Toolbox for now. Right-click on thejoined
layer and select Open Attribute Table.
You will see a new set of fields, including the
HD01_VD01
field containing population estimates.
Now that we have the population data in the census tracts layer, we can style it to create a visualization of population density distribution. Select the
joined
layer and click the Open the Layer Styling Panel button.
In the Layer Styling panel, select
Graduated
from the drop-down menu. As we are looking to create a population density map, we want to assign different color to each census tract feature based on the population density. We have the population in the HD01_VD01 field, but we don’t have population density in any fields to select as the Value. Fortunately, QGIS allows us to input an expression here. Click Expression button.
Opomba
When creating a thematic (choropleth) map such as this, it is important to normalize the values you are mapping. Mapping total counts per polygon is not correct. It is important to normalize the values dividing by the area. If you are displaying totals such as crime, you can normalize them by diving by total population, thus mapping crime rate and not crime. Learn more
Enter the following expression to calculate the population density.
$area
calculates the area of the feature in square meters. We then convert it to square miles and calculate the population density with the formula Population/Area. Click OK.
"HD01_VD01"/ (0.386*$area/1e6)![]()
Back in the Layer Styling Panel, choose a color ramp of your choice and click Classify. You can adjust the class ranges to be more appropriate to the region.
The visualization feels a bit cluttered because of the polygon borders. Click on the dropdown next to Symbol. Select Simple fill and check Transparent stroke.
Now we have a nice looking information visualization of population density in California.