Project Background
In 2007, the Pennsylvania Historical and Museum Commission (PHMC) began a multi-year
project to document the agricultural history of the Commonwealth of Pennsylvania. The project
includes narrative histories describing the evolution of different farming systems around the
state, historic census data, a field guide to historic farm buildings and landscapes, and
bibliographic resources.
Project staff digitized manuscripts from the agricultural schedules of the 1850 federal census and
compiled the data for all counties and municipalities in Pennsylvania. A few computed average
fields were also included to aid in documentation for National Register of Historic Places
(NRHP) nominations. The tabulated census data was published online in PDF format for each
county.
Shortly after I joined PHMC in 2011, I conveyed to staff the value of converting and releasing
the agricultural census data in a format that would allow users to analyze the data directly and
create visualizations. Due to limited staff resources, the data conversion project remained in the
idea phase. The DAAN 871 interactive dashboard project presented the opportunity to complete
the data conversion.
Example of census data PDF file
Data Preparation
Data Extraction
The original Excel files for 38 counties were retrieved from the Pennsylvania State Historic
Preservation Office (PA SHPO), a bureau of PHMC. Tabula was used to extract the data from
the PDF files for another 24 counties. Data for Philadelphia municipalities was unavailable in
Excel or PDF format, so I manually compiled the data from the original manuscripts.
Example of census manuscript with column totals for a borough in Philadelphia County
Data Compilation
An Excel file was created with a tab for each county. Only the original fields from the census
were retained. The computed averages were removed because the choice and number of
computed average fields varied across counties. In the future, a standard list of computed
average fields chosen by PA SHPO will be added back to the county tabs to aid with NRHP
documentation.
The original Excel and PDF files were inconsistent in listing whether a municipality was a
city, borough, district or township. A few counties have both a township and a borough with the
same name, so including a field for municipality type is important. The municipality type is listed
on the census manuscript web pages, so I added that information to the county tabs. During that
process, I discovered that many boroughs were missing from the tabulated data files. PA SHPO
explained that the primary purpose of the data was NRHP documentation and, since very few
farms currently exist in boroughs, the borough data was not compiled. I manually compiled the
data from the original manuscripts for the boroughs that were missing.
A new tab was created to include data from all of the counties so that the data can be analyzed
for the entire state rather than only on the individual county level.
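The statewide tab can be pictured with a minimal sketch like the one below. The actual compilation was done in Excel; this only illustrates the idea of stacking county tabs and keeping the municipality type so that a township and a borough with the same name stay distinct. The county names are real, but the municipality names, field names and numbers are placeholders invented for the example.

```python
# Hypothetical county tabs: county name -> list of municipality rows.
# Municipality names, field names and numbers are illustrative placeholders.
county_tabs = {
    "Adams": [
        {"municipality": "Sample", "type": "township", "farms": 120},
        {"municipality": "Sample", "type": "borough", "farms": 4},
    ],
    "Berks": [
        {"municipality": "Sample", "type": "township", "farms": 210},
    ],
}

def combine_tabs(tabs):
    """Stack every county tab into one list, adding a 'county' field to each row."""
    combined = []
    for county, rows in sorted(tabs.items()):
        for row in rows:
            combined.append({"county": county, **row})
    return combined

statewide = combine_tabs(county_tabs)

# County + municipality name + municipality type identifies each row uniquely,
# even when a township and a borough share a name.
keys = {(r["county"], r["municipality"], r["type"]) for r in statewide}
state_total_farms = sum(r["farms"] for r in statewide)
print(len(statewide), len(keys), state_total_farms)
```

Without the type field, the two "Sample" rows in the first county would be indistinguishable once the tabs are combined.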
Data Dictionary
A data dictionary was created with help from the 1850 census documentation. Additional notes
were included for clarity. See Appendix A for a detailed explanation of the Schedule 4 headings.
Headings for the 1850 Federal Census, Schedule 4 Productions of Agriculture
Interactive Dashboard
Purpose
The Pennsylvania Agricultural History Project explains the significance of the industry:
"Farming has guided Pennsylvania's economic growth and cultural development and has
profoundly shaped the lands and people of the Commonwealth." The United States Census
Bureau writes that the census of 1850 was the first for which a special agricultural schedule was
provided.
The purpose of the 1850 Agricultural Production in Pennsylvania interactive dashboard is to give
a high-level overview of agriculture in Pennsylvania during the mid-19th century. It was the first
time in history that data was collected on agricultural production at a national scale. The primary
user has a casual interest in the subject matter and would like to explore areas of highest
agricultural production in Pennsylvania during that era.
Data
The dataset includes 14 measures for acreage, farm machinery and livestock and 32 measures for
produce. Each measure is tabulated for 1,273 municipalities across 63 counties. The dashboard
includes aggregated data at the county level for all 14 acreage, machinery and livestock measures
and the top 11 produce measures. The produce measures each have a grand total of 1 million or
more bushels, pounds or tons. This is sufficient for a high-level overview. The raw data will be
available for those interested in a deeper dive into all of the produce measures. A shapefile for
historical county boundaries for Pennsylvania is inner joined to the census data.
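The county-level aggregation and the 1 million cut-off for produce measures can be sketched as follows. This is only an illustration of the selection rule described above, not the workflow actually used in Excel and Tableau; the measure names and every number are placeholders invented for the example.

```python
from collections import defaultdict

# Hypothetical municipality-level rows; measure names and numbers are placeholders.
rows = [
    {"county": "Adams", "wheat_bushels": 600_000, "hops_pounds": 300},
    {"county": "Adams", "wheat_bushels": 250_000, "hops_pounds": 0},
    {"county": "Berks", "wheat_bushels": 400_000, "hops_pounds": 1_200},
]
MEASURES = ["wheat_bushels", "hops_pounds"]
THRESHOLD = 1_000_000  # grand-total cut-off described in the text

def county_totals(rows, measures):
    """Sum each measure over the municipalities of each county."""
    totals = defaultdict(lambda: dict.fromkeys(measures, 0))
    for row in rows:
        for m in measures:
            totals[row["county"]][m] += row[m]
    return dict(totals)

def measures_over_threshold(totals, measures, threshold):
    """Keep the measures whose statewide grand total meets the threshold."""
    return [m for m in measures
            if sum(county[m] for county in totals.values()) >= threshold]

by_county = county_totals(rows, MEASURES)
kept = measures_over_threshold(by_county, MEASURES, THRESHOLD)
print(by_county["Adams"]["wheat_bushels"], kept)
```

In this toy data only the wheat measure reaches the cut-off, so it is the only one that would be carried into the dashboard.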
Analytic Questions
The interactive dashboard answers these analytic questions about agricultural production in
Pennsylvania in 1850:
• How many farms were in Pennsylvania? Or each county?
• Which counties had higher numbers of farms? Or lower numbers of farms?
• What was the cash value of farms in Pennsylvania? Or each county?
• What was the value of farm machinery in Pennsylvania? Or each county?
• What was the value of live stock in Pennsylvania? Or each county?
• What was the value of animals slaughtered in Pennsylvania? Or each county?
• How many of each type of live stock did Pennsylvania have? Or each county?
  o Horses
  o Milch Cows
  o Asses and Mules
  o Working Oxen
  o Sheep
  o Swine
  o Other Cattle
• How many acres of improved and unimproved land were in Pennsylvania? Or each county?
• How many total acres of land were in Pennsylvania? Or each county?
• How many units of each type of produce did Pennsylvania yield? Or each county?
  o Buckwheat
  o Indian Corn
  o Irish Potatoes
  o Oats
  o Rye
  o Wheat
  o Hay
  o Butter
  o Cheese
  o Maple Sugar
  o Wool
Design Principles
The centerpiece of the dashboard is a map of Pennsylvania displaying the number of farms in
each county with corresponding shades of green and overlaid on a historical map. A county filter
is included at the top of the dashboard and controls which data is shown on all of the worksheets
below it.
The original spelling and spacing are retained from the census headings. For example, milch
cows instead of milk cows and live stock instead of livestock.
The background design is adorned with historic lithographs and headings are created with the
Abraham Lincoln font to simulate typography from the era. A section at the bottom includes a
few facts about 1850 as well as definitions and a link to learn more about Pennsylvania
agricultural history.
The detailed methodology section has additional information about the dashboard design and
construction.
Detailed Methodology
County Boundaries
The 1850 Pennsylvania agricultural census data is loaded into Tableau, then the farms measure
and county dimension are projected onto a map. Because the 2017 county boundaries differ
from the 1850 county boundaries, there are gaps in the map where Cameron, Forest, Lackawanna
and Snyder counties are currently located. Cameron, Lackawanna and Snyder counties did not
yet exist in 1850. Forest County existed but was mostly forest area with no agricultural
production.
1850 Census: Total Farms Per County (2017 boundaries)
The Atlas of Historical County Boundaries by The Newberry Library includes datasets of county
boundaries for each state. The KMZ file is loaded into Google Earth to limit the Pennsylvania
county boundaries to only those that existed in 1850. The pared dataset is exported as a KML file
for use in Tableau. Because Tableau would not accept the KML file, it is converted to a
shapefile (SHP). The shapefile data is inner joined with the agricultural census dataset.
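The effect of the inner join can be illustrated with a small sketch. It assumes the join key is the county name and that Forest County has no rows in the census table; both are assumptions made for the example, since the join itself was configured in Tableau. The farm counts are placeholders and the geometries are stand-in strings rather than real polygons.

```python
# Hypothetical inputs. County names are real; farm counts are placeholders
# and the geometry values are stand-in strings, not real boundary polygons.
boundaries_1850 = {
    "Adams": "geometry-A",
    "Berks": "geometry-B",
    "Forest": "geometry-F",
}
census_farms = {"Adams": 100, "Berks": 200}  # assumed: no census rows for Forest

def inner_join(left, right):
    """Return only the keys present in both inputs, pairing their values."""
    return {key: (left[key], right[key]) for key in left if key in right}

joined = inner_join(boundaries_1850, census_farms)
unmatched = sorted(set(boundaries_1850) - set(census_farms))
print(sorted(joined), unmatched)
```

Under these assumptions, a boundary with no matching census rows drops out of an inner join, which would leave that county unfilled on the map.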
Tableau data connections
The geometry measure for the 1850 county boundaries is added to the number of farms
worksheet. All counties are now filled with data except Forest County, which did exist but did not
have any agricultural production in 1850. Most of the county labels are slightly adjusted to align
within the older county boundaries.
1850 Census: Total Farms Per County (1850 boundaries)
Map Georeferencing
A map dated 1853 by Ensign & Phelps, N.Y. is selected from Historical Maps of Pennsylvania.
The main map section within the Pennsylvania borders is washed out in Photoshop to prevent the
county colors on the map from conflicting with the colors on the visualization when data is
filtered to an individual county. The map is georeferenced in QGIS and imported into Mapbox to
create a custom map background. The Mapbox API is used in Tableau to connect to the map
service and use the map in the dashboard.
1853 map of Pennsylvania by Ensign & Phelps, N.Y.
Georeferencing the washed out map
Final map in Tableau with custom background and number of farms per county
Imagery and Typography
While searching for historic photos related to agriculture and Pennsylvania in the mid-1800s, I
discovered lithographs of diplomas awarded by local agricultural societies. Two public domain
prints are selected for the dashboard background:
• Diploma awarded by the Doylestown Agricultural and Mechanics Institute which depicts
  scenes from farm and country life, as well as examples of agricultural produce across
  the top and on both sides.
• Diploma awarded by the Luzerne County Agricultural Society which shows rural views
  of a farm, farmers, and livestock, also arrangements of farm produce, and a farmer
  driving a horse-drawn reaper, and a railroad at a factory or processing plant.
The Abraham Lincoln font by Frances MacLeod is chosen for dashboard heading text. The
typeface description: Inspired by the proportions of the 16th President of the USA, and
advertisements/playbills of the 1800s, Abraham Lincoln is a humanistic display face with
moderate contrast and sturdy serifs.
Times New Roman is used for all visualization text and non-heading text, including tooltips.
Diploma awarded by the Doylestown Agricultural and Mechanics Institute (c. 1867)
Diploma awarded by the Luzerne County Agricultural Society (c. 1857)
Example of Abraham Lincoln font by Frances MacLeod
Limitations and Future Enhancements
The map georeferencing is slightly out of alignment, which is most noticeable in the southeastern
part of the state when the map data is added to the custom map background. Several attempts
were made to better georeference the map by using different transformation types and
resampling methods and by varying the number of points. In the future, I can work with a
colleague at PHMC who is a GIS specialist to improve my georeferencing skills and fix the map.
Forest County did not have any agricultural production in 1850, so the map is blank in that spot.
An annotation is used to fill in the county name, but I cannot find an easy method to hide the
annotation when the dashboard is filtered to a single county.
There is also no easy method to change the font size for the selected text in the filter dropdown.
Visualizations in Tableau cannot have a transparent background and the color pickers do not
include an option to input exact RGB or hexadecimal color codes. Dashboards do not have a
snap to grid feature. Due to those limitations, background imagery could not be perfectly
blended or aligned with the visualizations. I may work with a graphic designer or Tableau
specialist in the future to learn more effective methods of blending dashboard backgrounds and
components.
Filter dropdown: tiny font size that cannot be changed, compared with the preferred font size
I cannot find an easy method to add pop-up informational boxes to the dashboard. Learning how
to swap and pop sheets in a Tableau dashboard is a future goal. As a workaround, an
information icon (lowercase i inside of a circle) is included next to the visualization for
measures that require additional information (animals slaughtered and unimproved/improved
land), with corresponding definitions at the bottom of the dashboard.
Additional datasets can be added in the future for 1850 population and square miles per county
for further comparisons. Counties with higher populations and square miles tend to have more
agricultural production. As an example, the agricultural production per 100 people or 50 square
miles can be compared.
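The proposed comparison amounts to a simple rate calculation, sketched below. The function name and the county figures are hypothetical and invented for illustration; no population or area data has been added to the project yet.

```python
def per_unit(total, base, unit):
    """Scale a county total to a rate per `unit` of the base (people or square miles)."""
    if base <= 0:
        raise ValueError("base must be positive")
    return total * unit / base

# Placeholder county figures, invented for illustration only.
farms, population, square_miles = 2_000, 40_000, 500

farms_per_100_people = per_unit(farms, population, 100)
farms_per_50_sq_miles = per_unit(farms, square_miles, 50)
print(farms_per_100_people, farms_per_50_sq_miles)
```

Expressing each measure as a rate would let a small, intensively farmed county be compared with a large one on equal terms.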
Despite these limitations, the interactive dashboard for 1850 Agricultural Production in
Pennsylvania turned out well and I am pleased with the final product.
Dashboard Link
The dashboard has been published to Tableau Public:
http://tabsoft.co/2oMy0AU
References
Abraham Lincoln by Frances MacLeod. Lost Type Co-Op, n.d. Web. 01 Apr. 2017.
<http://www.losttype.com/font/?name=Abraham%20Lincoln>.
Agricultural Schedules, 1850 to 1900. United States Census Bureau, n.d. Web. 01 Apr. 2017.
<https://www.census.gov/history/pdf/agcensusschedules.pdf>.
Atlas of Historical County Boundaries. The Newberry Library, 2010. Web. 01 Apr. 2017.
<http://publications.newberry.org/ahcbp/index.html>.
Ensign & Phelps, N. Y. 1853 Pennsylvania. Historical Maps of Pennsylvania, n.d. Web. 01 Apr.
2017. <http://www.mapsofpa.com/antiquemaps35.htm>.
Federal Decennial Census, 1850. National Archives, Washington; Record Group 029, National
Archives and Records Service, General Services Administration.
Lebergott, Stanley. "Labor Force and Employment, 1800–1960." Output, Employment, and
Productivity in the United States after 1800. Ed. Dorothy S. Brady. N.p.: National Bureau of
Economic Research, 1966. 117-21. Web. 01 Apr. 2017.
<http://www.nber.org/chapters/c1567.pdf>.
Pennsylvania Agricultural History Project. Pennsylvania Historical and Museum Commission,
n.d. Web. 01 Apr. 2017. <http://phmc.info/PaAgHistory>.
Pennsylvania. Agriculture Farms and Implements, Stock, Products, Home Manufactures, &c.
The Seventh Census of the United States: 1850. United States Census Office, 1853. 194-198.
Web. 01 Apr. 2017. <http://usda.mannlib.cornell.edu/usda/AgCensusImages/1850/1850a-08.pdf>.
P.S. Duval & Son, Printer, and James Fuller Queen. Diploma Awarded to [blank] by the
Doylestown Agricultural and Mechanics Institute ... / James Queen ; P.S. Duval, Son & Co. The
Library of Congress, n.d. Web. 01 Apr. 2017. <https://www.loc.gov/item/2015647823/>.
P.S. Duval & Son, Printer, and James Fuller Queen. This diploma was awarded by the Luzerne
County Agricultural Society at their blank annual fair / P.S. Duval & Son's lith. Philada. The
Library of Congress, n.d. Web. 01 Apr. 2017. <https://www.loc.gov/item/2014648443/>.
Appendix B: List of Software
Software         Purpose
Tabula           Data extraction from PDF files
Microsoft Excel  Data organization and compilation
Tableau          Data visualization and interactive dashboard creation
Photoshop        Image editing and creation
Google Earth     Data reduction of boundaries for PA counties that existed in 1850
QGIS             Map georeferencing
Mapbox           Custom map style creation