
Friday 6 November 2015

Geocoding Frac Sand Mines in Wisconsin

Goals and objectives

This lab continues the previous frac sand mining lab, in which spatial data was collated for Trempealeau County, Wisconsin. In this lab, raw mine address information from an Excel spreadsheet is turned into mapped points through the process of geocoding, so that the mines can be spatially analyzed for sand mining suitability in subsequent labs. After geocoding, we compare our results to those of our peers and to the precise locations provided by our professor.

The objectives are listed below (figure 1).

Figure 1. Geocoding lab objectives.


Methods

Several steps have to be completed in order to geocode properly. The data that we received in Excel from the DNR was messy and not normalized (figure 2). To normalize it, we needed to separate the location information into different cells containing the address, city, state, zip code, and unique mine ID (figure 3).

Figure 2. Non-normalized data in Excel. Note that the address column contains PLSS information as well as the zip code, city, and other information.

Figure 3. The normalized output table, which was then exported to ArcMap for geocoding.
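The normalization step can also be sketched in Python. This is only a minimal sketch, assuming a hypothetical raw format of "street, city, state zip"; the actual DNR records were messier and were normalized by hand in Excel. The function name, field names, and sample record are made up for illustration.

```python
import re

# Hypothetical pattern: "street, city, ST 12345". Real records varied more than this.
ADDRESS_RE = re.compile(
    r"^\s*(?P<address>[^,]+),\s*(?P<city>[^,]+),\s*"
    r"(?P<state>[A-Z]{2})\s+(?P<zip>\d{5})\s*$"
)

def normalize_record(mine_id, raw_location):
    """Split one raw location string into separate geocoding fields.

    The address fields are None when the string does not match the
    pattern, for example when only a PLSS description was supplied.
    """
    match = ADDRESS_RE.match(raw_location)
    if match is None:
        return {"mine_id": mine_id, "address": None, "city": None,
                "state": None, "zip": None, "raw": raw_location}
    fields = {key: value.strip() for key, value in match.groupdict().items()}
    fields["mine_id"] = mine_id
    fields["raw"] = raw_location  # keep the original text for manual checking
    return fields

# Made-up sample record, not taken from the DNR spreadsheet.
record = normalize_record(7, "123 Example Rd, Arcadia, WI 54612")
```

The zip code is kept as text rather than a number so that leading zeros survive, and records that fail to match are flagged instead of guessed at, since those are the ones that need the PLSS lookup.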

The data provided for the lab was distributed by the Wisconsin Department of Natural Resources (DNR). Much of the data that the DNR receives from the mines is sporadic, and we received that data directly. Some of the records included the address, city, and zip code, so we could easily find the location of the mine; others, however, contained only a PLSS description. PLSS is the Public Land Survey System (figure 4), the grid system on which Wisconsin, and much of the United States, bases its land descriptions. This is why a grid-like pattern appears on aerial imagery (figure 5).
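To show how a PLSS description narrows down to a parcel, here is a rough Python sketch. A section is nominally one square mile (640 acres), and each quarter division cuts the area by four. The description layout used here ("NE1/4 SW1/4 S12 T21N R8W") is an assumption for illustration; the strings in the DNR spreadsheet may be written differently.

```python
import re

SECTION_ACRES = 640  # a PLSS section is nominally one square mile

# Hypothetical layout: quarter divisions, then section, township, range.
PLSS_RE = re.compile(
    r"^(?P<parts>(?:(?:NE|NW|SE|SW)1/4\s+)*)"
    r"S(?P<section>\d{1,2})\s+"
    r"T(?P<township>\d+)(?P<tdir>[NS])\s+"
    r"R(?P<range>\d+)(?P<rdir>[EW])$"
)

def parse_plss(description):
    """Parse a PLSS string and return its parts plus the nominal parcel size."""
    match = PLSS_RE.match(description.strip())
    if match is None:
        raise ValueError(f"unrecognized PLSS description: {description!r}")
    quarters = match.group("parts").split()
    return {
        "quarters": quarters,  # read right to left: largest division is last
        "section": int(match.group("section")),
        "township": match.group("township") + match.group("tdir"),
        "range": match.group("range") + match.group("rdir"),
        "nominal_acres": SECTION_ACRES / 4 ** len(quarters),
    }

parcel = parse_plss("NE1/4 SW1/4 S12 T21N R8W")
```

In this example the northeast quarter of the southwest quarter of section 12 comes out to a nominal 40 acres, which is still a sizeable area to search for one mine.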

Figure 4. PLSS grid. Mine sites given without an address instead had a PLSS description. The image depicts how to read a PLSS description. (http://geology.isu.edu/geostac/Field_Exercise/topomaps/plss.htm)

Figure 5. Aerial imagery of farmland. Note the grid-like patches that are derived from the PLSS. (http://oklahomafarmreport.com/wire/news/2012/11/media/05086_FarmAerialView06282012.jpg)

Once the Excel table was normalized, it was brought into ArcMap and the Geocoding Extension was turned on. The ESRI address database could then match all potential addresses. Of my 19 addresses, 14 were matched and 5 could not be found. The matched locations were verified by inspecting each address point in ArcMap. If an address point was 'off', a new point could be created to make the location accurate.

Some addresses could not be found because ESRI's geocoder cannot locate sites from PLSS data when no address information is provided. Without any hint of where these mines were located, we needed to turn to the PLSS shapefiles, which contained the information required to locate the PLSS mines. Some PLSS descriptions were straightforward; others listed multiple PLSS descriptions, which made it more difficult to find the exact location of the mine. To work around this, I looked up the names of the mines to see whether their approximate locations could be determined using Google Earth, which would let me narrow down the mine location.

By the end of my geocoding, I had kept only 3 of the address points that ESRI found in the initial stage. Although many of them were near the mine locations, they were often far from the road, which makes them less accurate for the information required in future labs.

Once all of the geocoding was complete, I compared my results to my classmates' results and to my professor's exact mine locations. Organizing the mines for this was tricky, as four of my peers had been assigned some of the same mines. To combine their data into one shapefile, I went into each attribute table, selected the mines that I had been given to geocode, and created a new layer from the selected features. This left fewer mine locations to sift through than a merge of all of my peers' mine sites, which would have included mines that I had not geocoded. Another inconvenience was that we had each given our 'unique mine ID' field a different name. Without a common field name, the data cannot be merged directly, so I had to create new fields in each of my peers' attribute tables. In addition, all of the shapefiles needed to be projected into a projected coordinate system instead of a geographic coordinate system so that distances could be calculated in measurable lengths, not degrees.
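The reason degrees do not work as a distance unit can be shown with a short Python sketch. It assumes a spherical Earth with a mean radius of 6,371 km and uses 44.3 degrees north as a rough latitude for western Wisconsin; both values are approximations chosen for illustration and are not part of the lab data.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; a spherical approximation

def meters_per_degree(latitude_deg):
    """Return the (north-south, east-west) ground length of one degree, in meters."""
    one_degree = math.pi * EARTH_RADIUS_M / 180
    # Meridians converge toward the poles, so a degree of longitude
    # shrinks with the cosine of the latitude.
    return one_degree, one_degree * math.cos(math.radians(latitude_deg))

# Assumed, approximate latitude for western Wisconsin.
north_south, east_west = meters_per_degree(44.3)
```

At that latitude a degree of latitude covers roughly 111 km on the ground while a degree of longitude covers only about 80 km, so a distance expressed in degrees means different lengths depending on direction.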

The 'Merge' tool was then used to create one shapefile with my peers' mine information. To determine how accurate my geocoding was compared to my peers' results and to the exact locations provided by my professor, I used the 'Point Distance' tool.
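The select, rename, and merge steps can be pictured with plain Python dictionaries standing in for attribute tables. This is a sketch of the idea only, not what ArcMap does internally; the function name, the field names ("MineID", "Unique_ID", "mine_id"), and the sample rows are all hypothetical.

```python
def merge_tables(tables, id_fields, common_id="mine_id", keep_ids=None):
    """Combine attribute tables whose unique-ID columns have different names.

    tables    -- {owner: list of row dicts}
    id_fields -- {owner: name of that owner's ID column}
    keep_ids  -- optional set of mine IDs to keep (only the mines I geocoded)
    """
    merged = []
    for owner, rows in tables.items():
        id_field = id_fields[owner]
        for row in rows:
            mine_id = row[id_field]
            if keep_ids is not None and mine_id not in keep_ids:
                continue  # skip mines that were not assigned to me
            new_row = {key: value for key, value in row.items() if key != id_field}
            new_row[common_id] = mine_id  # every row now shares one ID column
            new_row["source"] = owner     # remember whose point this is
            merged.append(new_row)
    return merged

# Made-up rows: two peers who named their ID column differently.
tables = {
    "peer_a": [{"MineID": 1, "x": 10.0, "y": 20.0}, {"MineID": 99, "x": 0.0, "y": 0.0}],
    "peer_b": [{"Unique_ID": 1, "x": 12.0, "y": 21.0}],
}
id_fields = {"peer_a": "MineID", "peer_b": "Unique_ID"}
merged = merge_tables(tables, id_fields, keep_ids={1})
```

Filtering before merging mirrors the choice made in the lab: the merged table only ever contains mines that can actually be compared.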

Results

As expected, my mine locations were 'off' compared to the actual mine locations. Figures 6 and 7 below show the output tables from the 'Point Distance' tool. The distances were calculated in meters.

Figure 6. The output table from the Point Distance tool. The distance, shown in meters, is how far each mine I geocoded was from the actual mine location. The average error distance was 3,756 meters from my mine sites to the actual mine sites.

Figure 7. The output table from the Point Distance tool. This comparison was between my mine locations and my peers'. Because some of my peers were missing mines, I could only show the distance for 13 of the 19 mines.
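The comparison behind these tables can be approximated in a few lines of Python. This is a simplified sketch, not the Point Distance tool itself: it assumes the points are already in a projected coordinate system with meter units, and it pairs points by mine ID. The function names and the sample coordinates are hypothetical.

```python
import math

def point_distances(mine_points, reference_points):
    """Distance in meters between points that share a mine ID.

    Both arguments map mine ID -> (x, y) in the same projected coordinate
    system with meter units. Mines missing from either side are skipped.
    """
    distances = {}
    for mine_id, (x1, y1) in mine_points.items():
        if mine_id not in reference_points:
            continue  # e.g. a peer never posted this mine
        x2, y2 = reference_points[mine_id]
        distances[mine_id] = math.hypot(x2 - x1, y2 - y1)
    return distances

def mean_error(distances):
    """Average distance over the mines that could be compared."""
    if not distances:
        raise ValueError("no shared mines to compare")
    return sum(distances.values()) / len(distances)

# Made-up coordinates; mine 3 has no reference point and is left out.
mine = {1: (0.0, 0.0), 2: (100.0, 100.0), 3: (5.0, 5.0)}
reference = {1: (3.0, 4.0), 2: (100.0, 100.0)}
errors = point_distances(mine, reference)
average = mean_error(errors)
```

Skipping unmatched mines is why a mean can be based on fewer points than were geocoded, as with the 13 of 19 mines in figure 7.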

Figures 8-10 below are maps of the mine locations that I found, the actual mine locations, and my peers' mine locations.

Figure 8. The locations of the mines I geocoded and the actual locations of the mines. Overall, my locations were fairly close to the actual mine locations.

Figure 9. My mines in comparison to my peers'. They were mostly spot on; it is a relief to know that the locations I found were very similar to my peers'.

Figure 10. All of the mines that were compared in the process of geocoding.

Discussion

In the end, my mean error against the actual data from the DNR was larger, at 3,756 meters, whereas my mean error against my peers' locations was 1,030 meters. One issue with the peer figure is that I had only 13 of the 19 mines to compare, because some of my peers did not post their mines to ArcMap. The errors came largely from the data automation and compilation stage, which includes digitizing (or geocoding). This is because the geocoder, if you choose to use the points it finds, generally places addresses by estimation rather than at a precise location. In addition, anyone adding points manually will not place them exactly where someone else would, so even the 'correct' points could technically be wrong.

Additionally, attribute data input, another error type, could be a factor: the attributes could have been entered incorrectly by the DNR or by the people who supplied the information. This is very likely, considering that the data provided was sporadic and sometimes difficult to understand.

We can know which points are correct by having the lat/long data from the DNR. Most of the time, lat/long data is a reliable way of pinning down a particular address location. However, another error type, field survey measurement, could have affected the lat/long data when it was first collected, giving an incorrect location.

Conclusion

The process of geocoding is extremely helpful in a spatial analysis; without it, one would lose accuracy. Geocoding is never perfectly precise, however, especially when one has to add the points manually. This shows that some caution is needed when judging a map's accuracy, which is exactly why it is so important to include the data source as well as the metadata. In addition, some of the 'correct' mine locations were actually centered on the mine itself without consideration of the closeness to the road, which is what we were advised to use for this lab. Basically, geocoding has to be taken as a relative form of locating addresses.

Sources
Wisconsin Department of Natural Resources. (n.d.). Retrieved November 8, 2015, from http://dnr.wi.gov

PLSS - Legal Descriptions | PLSS. (n.d.). Retrieved November 8, 2015, from http://www.sco.wisc.edu/plss/legal-descriptions.html 

