Posts Tagged ‘list’
Creating a list of keywords from the attribute table: New & Improved Version
Problem: You need a better way to summarize the contents of a dataset without wasting time typing a long list.
Back in March, I posted about creating a list of keywords from the attribute table. I knew that:
The solution I have come up with is not perfect. I keep thinking there must be a better way.
I had explained how to make a list of the administrative area types in the Natural Earth Admin 1 shapefile, using ArcMap’s “Summarize” tool. The end result was a text file that had extra quotation marks, commas, and other characters to remove.
There is a better way — with Python! I now have the power to structure the text file the way I want it from the start. Here’s the entire code:
import arcgisscripting
gp = arcgisscripting.create(9.3)
report = file(r"C:\gis\AdminTypes.txt", "w")
shape = r"C:\gis\10m-admin-1-states-provinces-shp.shp"
cur = gp.SearchCursor(shape)
row = cur.Next()
adminList = []
while row <> None:
adminList.append(row.ENGTYPE_1)
row = cur.Next()
adminList = list(set(adminList))
adminList.sort()
for admin in adminList:
report.write(admin + "\n")
report.close()
print "Finished"
Change only the portions marked in red to make this work with any data layer. They represent, in order from top to bottom:
- The output text file
- The input data layer
- The name of the attribute table field to summarize
The text file generated by this code is a list of each unique entry in the ENGTYPE_1 field (all duplicates removed) — one per line, in alphabetical order. It’s the exact same thing you would see in the table of contents if you symbolized by category on that field. Only now it’s in a form you can copy and paste into other places, like a metadata record.
Python makes everything better.
Tags: attribute table, keyword, list, python, report, text file
Comparing attribute tables with Python
- Problem #1: You are merging layers that have different attribute table structures and you don’t want to lose any important information.
- Problem #2: You want to use Python to return a list of shapefiles that match a search phrase.
- Problem #3: You want to use Python to compare two lists and return the differences.
The following will solve all three!
I learned a lot from taking Introduction to Geoprocessing Scripts Using Python. When I returned to work, I got the grand idea that I was ready to write a script that would make one looming task more manageable.
As is always the case, the solution seemed easier in my head than it turned out to be when I sat down to write it. The ESRI course had given me just enough information to be dangerous. I had most of the pieces I needed to put together this puzzle, but not all. Those missing pieces are what I would like to share with you today.
The Background
I had created a geodatabase feature class and I needed to append and bunch of shapefiles to it. All of these shapefiles had attribute tables that were close to each other, but not exactly alike. Some had extra fields. I needed to make a list of all of those extra fields so that I could add them to the feature class before I started loading shapefiles.
The Solution
The Python arcgisscripting module has a ListFields() function that forms the basis of my solution. But before I could get to that point, I needed to make a list of the shapefiles whose fields I wanted made into a list.
Depending on your work flow, that might be all the shapefiles in a particular directory. For me, it was all the shapefiles in a particualr directory that matched a file name search phrase. One example: I wanted to merge together all the shapefiles that had “bridge” in the name. My teacher introduced us to the os.walk() function, but I needed to modify it slightly to do exactly what I wanted. Here’s how:
# import modules and create the geoprocessor object
# (some of these you won't use until later)
import arcgisscripting, os, fnmatch, sys
gp = arcgisscripting.create(9.3)
# returns a list of shapefiles in the specified folder
# that match the specified search pattern
folder = r"F:\StagingArea\shapefiles"
pattern = sys.argv[1]
shpList = []
for path, dirs, files in os.walk(folder):
for filename in fnmatch.filter(files, pattern):
shpList.append(os.path.join(path, filename))
In my setup, the search folder is hard-coded and the search pattern is a variable that the user inputs. This worked for my purposes because I was searching through the same folder on a lot of different file names. Modify according to your purposes. Note: The search pattern needs to be in the form *bridge*.shp. You can use * wildcards and you need to put the .shp on the end.
The code so far has only created the list. To make it visible, use this:
gp.AddMessage("Shapefiles returned:")
for shp in shpList:
gp.AddMessage(shp)
Both segments of code assume you will be running the script from within ArcToolbox. Use a print statement if you will be running it from within IDLE or PythonWin. Note that this list returns not just the shapefile names, but their full paths. This makes the list usable as input into the next function.
The code to create a list of attribute field names is pretty simple. Here’s how it looks for my single target feature class (the one I’m going to be appending everything to):
targetLayer = sys.argv[2]
targetList = gp.ListFields(targetLayer)
targetNames = []
for field in targetList:
targetNames.append(field.Name)
To create a list of attribute field names for a list of shapefiles, you add another loop. Here’s how it looks using the shpList I created earlier.
x = 0
fieldNames = []
for shps in shpList:
fieldList = gp.ListFields(shpList[x])
for field in fieldList:
fieldNames.append(field.Name)
x = x + 1
Now we have two lists. The first list is all the field names in the target layer and the second list is all the field names in every layer we want to add on to the target. What we really need is the difference between these two lists. In other words: we need to know which fields appear in the second list and not in the first list.
That sounds pretty complicated, but guess what, you can do it with two lines of code:
diffList = [fname for fname in fieldNames if not fname in targetNames]
diffList = list(set(diffList))
The first line compares the two lists. The second line’s list(set()) function removes any duplicates and alphabetizes the final list of differences.
Finally, to display your results:
gp.AddMessage("Missing fields:")
for field in diffList:
gp.AddMessage(field)
My finished code does a few extra things, like:
- eliminating unmatching geometry types from the shapefile list
- eliminating empty fields from the field name list
- listing the field length and type in the final output
Perhaps I will talk about those in a later entry. What I’ve shown here are the parts that were the most difficult to figure out.
Tags: attribute table, geodatabase, list, python, search, shapefile
Creating a list of keywords from the attribute table
Problem: You need to summarize the contents of a dataset without wasting time typing a long list.
Natural Earth is a crisp basemap dataset available for the world at 1:10m, 1:50m, and 1:110m scales. I downloaded one of their most detailed layers, the 1:10m Admin 1 – States, Provinces shapefile for use as my example.
If you were asked to write a metadata record or some other type of report about this dataset, you would want your description to be as complete as possible. You might find it necessary to generate a long list of place or topic keywords. Often, the attribute table already includes this information. You just need a good way to extract it.
For instance, if I symbolize by category on the ENGTYPE_1 field of my natural earth layer, I get a long list of administrative area types. There’s way more than just states and provinces here! If only I could select this list and paste it somewhere else to have in a usable format. Unfortunately, it’s not that easy.
The solution I have come up with is not perfect. I keep thinking there must be a better way. If any of my readers know of one, I would love the feedback. For now, this method certainly beats having to type the entire list:
First, open attribute table and right click on the field you want to turn in to a keyword list. Then, select Summarize…
Chose one summary statistic field. It doesn’t matter which, but works a little better if you chose something that’s always going to be the same. I chose “CheckMe” because that’s basically a flag (yes or no) field. It doesn’t matter which summary statistic (minimum, maximum, first, last etc.) you pick either. You won’t be using any of this information–it’s just a way to get ArcMap to make a list for you that includes each category one time.
Hit the browse button in the Specify output table section. Then, change Save as type to Text File.
Open up your text file in Notepad and remove the first line (the field names). Then, do Edit — > Replace and replace all commas, quotation marks, numerals 1-9 and whatever text phrase was in your summary statistic field with nothing.
When you are done, you end up with this. If you want to add commas and move these all into one row, it’s pretty easy from here.
Tags: attribute table, keyword, list, report, summarize, text file