This lab continues the previous frac sand mining lab, in which spatial data was collated for Trempealeau County, Wisconsin. This lab takes raw mine address information from an Excel spreadsheet and geocodes it, turning it into a map so that the mines can be spatially analyzed for sand mining suitability in subsequent labs. After geocoding, we compare our results with those of our peers and with the precise locations provided by our professor.
The objectives are listed below (figure 1).
Figure 1. Geocoding lab objectives.
Methods
Several steps have to be completed in order to geocode properly. The data that we received in Excel from the DNR was quite messy and was not normalized (figure 2). To normalize the data, we separated the location information into different cells containing the address, city, state, zip code, and unique mine ID (figure 3).
Figure 2. Non-normalized data in Excel. Note that the address column contains PLSS information as well as the zip code, city, and other information.
Figure 3. The normalized output table, which was then exported to ArcMap for geocoding.
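The normalization step was done by hand in Excel, but the idea can be sketched in Python. This is only an illustration: the raw format "street, city, ST zip", the function name normalize_row, and the sample address are all assumptions made up for the example, and the real DNR rows were far less regular than this.

```python
import re

# Assumed raw format: "street, city, ST zip". This is a hypothetical,
# well-behaved case; the actual DNR rows were messier.
ADDRESS_PATTERN = re.compile(
    r"^\s*(?P<address>[^,]+),\s*(?P<city>[^,]+),"
    r"\s*(?P<state>[A-Z]{2})\s+(?P<zip>\d{5})\s*$"
)


def normalize_row(mine_id, raw_location):
    """Split one raw location string into separate fields.

    Rows that do not fit the assumed pattern (for example PLSS-only rows)
    come back with empty fields so they can be handled manually.
    """
    row = {"mine_id": mine_id, "address": None, "city": None,
           "state": None, "zip": None, "raw": raw_location}
    match = ADDRESS_PATTERN.match(raw_location)
    if match is not None:
        for field, value in match.groupdict().items():
            row[field] = value.strip()
    return row


# Made-up sample row, not taken from the DNR spreadsheet.
example = normalize_row(7, "123 Example Rd, Sampletown, WI 54000")
print(example["address"], "|", example["city"], "|",
      example["state"], "|", example["zip"])
```

Keeping the original string in a "raw" field means nothing is lost for the rows that still need to be located by hand.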
Figure 4. PLSS grid. Mine sites that were given without an address instead had a PLSS description. The image depicts how to read a PLSS description. (http://geology.isu.edu/geostac/Field_Exercise/topomaps/plss.htm)
Figure 5. Aerial imagery of farmland. Note the grid-like patches that are derived from the PLSS. (http://oklahomafarmreport.com/wire/news/2012/11/media/05086_FarmAerialView06282012.jpg)
Some mines could not be found because ESRI's geocoder cannot find locations based on PLSS data when no address information is provided. Without a hint of where these mines were located, we needed to turn to the PLSS shapefiles, which contain all of the information required to locate the PLSS mines. Some PLSS descriptions were straightforward, but others listed multiple PLSS descriptions for one mine, which made it more difficult to find the exact location. To work around this, I looked up the names of the mines in Google Earth to see if their approximate locations could be determined, in the hope of narrowing down each mine location.
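To look up a PLSS mine in the PLSS shapefiles, the township, range, and section first have to be read out of the description. The sketch below shows one way that could be done in Python. The notation it expects (for example "Sec 12, T22N R8W") and the function name parse_plss are assumptions for illustration only; the descriptions in the DNR table did not follow one consistent format.

```python
import re

# Assumed notation: "T22N R8W" for township/range and "Sec 12" or
# "Section 12" for the section. Other spellings will not be recognized.
TOWNSHIP_RANGE = re.compile(
    r"T\s*(?P<township>\d+)\s*(?P<tdir>[NS])\W*"
    r"R\s*(?P<range>\d+)\s*(?P<rdir>[EW])",
    re.IGNORECASE,
)
SECTION = re.compile(r"SEC(?:TION)?\.?\s*(?P<section>\d+)", re.IGNORECASE)


def parse_plss(text):
    """Return section, township, and range from a PLSS description.

    Returns None when the text does not contain both parts, so that the
    row can be flagged for a manual search instead.
    """
    town = TOWNSHIP_RANGE.search(text)
    sec = SECTION.search(text)
    if town is None or sec is None:
        return None
    return {
        "section": int(sec.group("section")),
        "township": int(town.group("township")),
        "township_dir": town.group("tdir").upper(),
        "range": int(town.group("range")),
        "range_dir": town.group("rdir").upper(),
    }


# Made-up description, not one of the actual mines.
print(parse_plss("SE 1/4 Sec 12, T22N R8W"))
```

The parsed values could then be matched against the section, township, and range attributes of a PLSS layer. A description that lists several sections would still need a person to decide which one holds the mine.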
By the end of my geocoding, I had kept only 3 of the address markers that ESRI found in the initial stages of geocoding. Although many of them were near the mine locations, they were often far away from the road, which makes them less accurate for the information required in future labs.
Once all of the geocoding was complete, I compared my results with my classmates' results and with my professor's exact mine addresses. Organizing the mines to do this was tricky, as four of my peers had geocoded some of the same mines I had. To combine their data into one shapefile, I opened each attribute table, selected the mines that I had been given to geocode, and created a new layer from the selected features. This meant sifting through fewer mine locations than if I had merged all of my peers' mine sites, which would have included mines that I had not geocoded. Another inconvenience was that we had all given our 'unique mine ID' field a different alias. Without a common naming scheme, the data cannot be merged cleanly, so I had to create new fields in each of my peers' attribute tables. In addition, all of the shapefiles needed to be projected into a projected coordinate system instead of a geographic coordinate system so that distances could be calculated in measurable lengths, not degrees.
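The reason degrees are a poor unit for distance can be shown with a small calculation. The sketch below uses the haversine formula on a spherical Earth, which is an approximation and not what ArcMap does when it projects data. The coordinates are rough values for western Wisconsin chosen for illustration.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/long points."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# One degree north-south versus one degree east-west at 44 degrees north.
one_deg_lat = haversine_m(44.0, -91.0, 45.0, -91.0)  # roughly 111 km
one_deg_lon = haversine_m(44.0, -91.0, 44.0, -90.0)  # roughly 80 km

print(round(one_deg_lat), round(one_deg_lon))
```

At this latitude a degree of longitude is only about 70 percent as long as a degree of latitude, so a distance expressed in degrees depends on direction. A projected coordinate system with meter units avoids that problem.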
The 'Merge' tool was then used to create one shapefile with my peers' mine information. To determine how accurate my geocoding was compared with my peers' results and with the exact locations provided by my professor, I used the 'Point Distance' tool.
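The comparison itself comes down to measuring the straight-line distance between two points that represent the same mine and then averaging those distances. The Python sketch below illustrates that idea only; it is not how the 'Point Distance' tool is implemented. It assumes the coordinates are already in a projected system with meter units, and the mine IDs and coordinates are invented for the example.

```python
import math


def distances_by_mine_id(my_points, reference_points):
    """Distance in meters between matching mine IDs.

    Both arguments map a mine ID to an (x, y) pair in projected meters.
    Mines missing from the reference set are skipped, which mirrors the
    case where a peer never posted that mine.
    """
    distances = {}
    for mine_id, (x, y) in my_points.items():
        if mine_id not in reference_points:
            continue
        ref_x, ref_y = reference_points[mine_id]
        distances[mine_id] = math.hypot(x - ref_x, y - ref_y)
    return distances


def mean_distance(distances):
    """Average of the per-mine distances."""
    if not distances:
        raise ValueError("no matching mines to compare")
    return sum(distances.values()) / len(distances)


# Invented coordinates, not the lab data.
mine = {101: (0.0, 0.0), 102: (1000.0, 1000.0), 103: (50.0, 50.0)}
reference = {101: (300.0, 400.0), 102: (1000.0, 1600.0)}

per_mine = distances_by_mine_id(mine, reference)
print(per_mine)                 # {101: 500.0, 102: 600.0}
print(mean_distance(per_mine))  # 550.0
```

Mine 103 has no counterpart in the reference set, so it drops out of the mean. The same thing happens to any mean that is computed from only the mines a peer actually posted.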
Results
As expected, my mine locations were 'off' compared with the actual mine locations. Figures 6 and 7 below show the output tables after running the 'Point Distance' tool. The distances were calculated in meters.
Figure 8. The locations of the mines I geocoded and the actual locations of the mines. Overall, my locations were fairly close to the actual mine locations.
Figure 9. My mines in comparison to my peers'. They were mostly spot on, and it is a relief to know that the locations I found were very similar to those of my peers.
Figure 10. All of the mines that were compared in the process of geocoding.
Discussion
In the end, my mean error was larger against the actual data from the DNR, at a distance of 3,756 meters, whereas my mean error distance against my peers was 1,030 meters. One issue with the peer figure is that I only had 13 of the 19 mines to compare against, because some of my peers did not post their mines to ArcMap. The errors came largely from the data automation and compilation stage, which includes digitizing (or geocoding). This is because a geocoder (if you choose to use the points it finds) generally places addresses by estimation, not at a precise location. In addition, anyone adding points manually will not place them exactly where someone else would, so even the 'correct' points could technically be wrong.
Additionally, attribute data input, another error type, could be a factor: the information could have been entered incorrectly by the DNR or by the people from whom they received it. This is very likely, considering that the data provided was sporadic and sometimes difficult to understand.
We can know which points are correct by having the lat/long data from the DNR. Most of the time, lat/long data is a reliable way of determining a particular location. However, another error type, field survey measurements, could have produced incorrect lat/long data in the first place, thus giving an incorrect location.
Conclusion
The process of geocoding is extremely helpful in a spatial analysis, and without it, one would lose accuracy. However, geocoding is never perfectly precise, especially when one has to add the points manually. This shows that accuracy has to be considered when looking at a map, which is exactly why it is so important to include the data source as well as the metadata. In addition, some of the 'correct' mine locations we were given were centered on the mine itself, without considering closeness to the road, which is what we were advised to do for this lab. Basically, you have to take geocoding as a relative form of locating addresses.
Sources
Wisconsin Department of Natural Resources. (n.d.). Retrieved November 8, 2015, from http://dnr.wi.gov
PLSS - Legal Descriptions. (n.d.). Retrieved November 8, 2015, from http://www.sco.wisc.edu/plss/legal-descriptions.html