I came across this post the other day, and it made flashy things go on and off inside my braincase as normally underused neurons woke up and stretched lazily (do click on the blue letters and read the post). While I agree with the crux of the above-linked post, the light show inside my skull was actually related to (mostly) other ideas. In my usual, intensely dull, Map Dorkish manner, I was thinking about data.
Really. It’s something I think about. A lot. It’s a sickness.
Anyway, I got to thinking about a discussion I had with a fellow Map Dork on Twitter a short while ago, about data and GIS. About how the majority of the GIS community spends the bulk of its time thinking about what to do with data, and not enough time thinking about the quality of the data itself.
It’s like this – whenever I make a map, there are two primary components involved in the process. The first is the software that produces said map. The leader in the field is far and away Esri, the company that produces ArcGIS (parts of which used to be known as ArcView). Esri does not produce my software of choice, for a variety of reasons, none of which should be taken as a comment on the software itself (okay – some of it should, but not a lot. Maybe 30% or so). Truth is that Esri wins Best In Show when it comes to proprietary software.
In Map Dorkia, though, proprietary software doesn’t carry the kind of weight it does in other fields. You see, a fair number of Map Dorks also happen to be coders (maybe even most of them). Because of this, the market has been flooded with a vast number of good, stable, working, free and open source alternatives. I can’t begin to mention them all, but I will point you to this site, where someone better informed than myself has put together some good overviews (even if parts of them are a bit out of date).
At the end of the day, my go-to GIS application is Quantum GIS (although it’s far from the only one I use). Like the Esri offerings, Quantum GIS is a good, all-around GIS package (but not as feature-packed). Unlike EsriWare, Quantum GIS has a huge, talented support base. Everyone who’s working on Quantum GIS is doing so because they care, not just to get a paycheck. Think about that.
The second component of any map I make is the data with which I make the map. This data comes in many shapes and sizes, as well as different formats and/or projections. The lion’s share of what I actually do involves taking all that crap and turning it into an accurate, useful and (hopefully) visually pleasing map. The problem that Map Dorks run into at this point is: Where to get the data?
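To put a little flesh on the "different projections" part, here is a minimal Python sketch of one such conversion: taking plain longitude/latitude and projecting it to Web Mercator, the projection most web maps use. This is my own illustration, not anything a particular GIS package does under the hood; the function name is made up, it uses the simple spherical formula, and the Boston coordinates are rough values picked for the example.

```python
import math

# WGS 84 semi-major axis in metres; spherical Web Mercator (EPSG:3857)
# treats the Earth as a sphere of this radius.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator metres.

    Hypothetical helper for illustration only. Valid for latitudes
    strictly between -90 and 90 degrees.
    """
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    x = EARTH_RADIUS_M * lon
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


# Rough coordinates for downtown Boston (an assumption for the example).
x, y = lonlat_to_web_mercator(-71.06, 42.36)
print(f"x = {x:.0f} m, y = {y:.0f} m")
```

Two layers that describe the same street will only line up on the map if both have been brought into the same projection first, which is a good chunk of the "taking all that crap and turning it into a map" work.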
Often, we turn to the federal government. The USGS has been producing quality maps almost since the Boston Tea Party, so we tend to think of them as a pretty safe bet. However, it’s wise to check the fine print on the quadrangle you’re looking at. Around these parts, they generally date back to the sixties, although many of them were updated in the eighties or nineties.
Our government also provides census data, also known as TIGER (Topologically Integrated Geographic Encoding and Referencing system) files. TIGER data comes in a variety of shapes and sizes, and is of varying accuracy (see below).
These days, most state governments have some sort of GIS department, as do many cities and towns. Their data tends to be more accurate than federal sources (although not always) due mainly to the fact that they have a much smaller area of focus. And, of course, some are better than others. Here in Massachusetts, we are lucky to have MassGIS. While MassGIS can be rather quirky (their file naming conventions leave a bit to be desired), they freely offer a wealth of data that tends to be pretty accurate (I know because I’ve checked a fair amount of it on the ground). They do have a budget, however, so some of their data gets a little old between updates. And while they offer tons of data via WMS, their servers – well – suck.
For my money, the most accurate data around (besides the data I go out and gather myself, of course) is that which comes from OpenStreetMap. Steve touched upon this in the post mentioned previously, but it bears repeating. Because data sources are many and various, it is often difficult to assess the accuracy of the data in question (especially if it’s data depicting an area geographically removed from your own location).
What makes OSM (OpenStreetMap) unique among data providers is the workforce that acquires the data. The OSM workforce isn’t made up of people looking only for a paycheck. The OSM workforce doesn’t daydream about something else while they’re gathering data. The OSM workforce is extraordinarily focused on the job at hand, because the only reason they’re doing it is that they really want to. They also really want the data to be accurate.
Possibly the most important aspect of the OSM workforce is their proximity to the area they provide data about. In the majority of cases, OSM data is collected by people who can vouch for the accuracy of their data because they can see it out their window or because they walked by it on their way home from work. When it comes to the OSM workforce, the person who mapped any given road has most probably walked down that road.
Because of the nature of the OSM workforce, I tend to trust the accuracy of OSM data more than most. To my mind it’s just plain common sense. And in my experience, OSM data is at least as good as any other source, usually better. Here’s a comparison of road data from three sources:
You can see the obvious shortcomings of the TIGER data. You will probably also note the similarity between the MassGIS data and the OSM data. This is because MassGIS (bless their little hearts) handed a bunch of data to OSM many moons ago (I don’t know exactly when this occurred). While this is a great thing for Massachusetts, not all of America was so lucky. And in my experience, even here in Massachusetts OSM data tends to be more up to date than MassGIS’s. The primary reason for this, I think, is that MassGIS dedicates the lion’s share of their budget to flashy projects. For instance, they just finished gathering new, state-wide aerial imagery – most at 30cm/pixel, some at 15cm. While the imagery is very cool and very useful, OSM will probably get around to utilizing it before MassGIS does.
As luck would have it, you don’t have to take my word for this. Bing Maps just rolled out a new feature: an OpenStreetMap layer. I did a quick comparison:
This pretty much speaks for itself. Not only is the OSM data more accurate (note the British Rail lines on the left, as well as the placement of the Oxford Canal), but OSM provides far more information than the Bing data (without overcrowding the map). In pretty much all ways, it’s just plain better data.
And before anyone points to the fact that OSM started in Great Britain (so of course OSM data is better over there), here’s a section of Boston I visited just the other day:
Kudos to Microsoft for including the OSM layer. By all means head on over to Bing Maps and check it out. It’s nice to see that they’ve finally figured out what many of us Map Dorks figured out long ago:
Always use the best data you can get your hands on.