The Constitution of the United States requires a census to be taken every 10 years. The count is used to determine congressional apportionment, Electoral College voting and government funding. The first U.S. census was administered in 1790 under the responsibility of Secretary of State Thomas Jefferson. It was taken by U.S. marshals on horseback, who counted 3.9 million inhabitants.
As America grew, the nation’s interests became more complex, and statistics were needed to help people understand what was happening and to plan for growth. The content of the decennial census has changed accordingly. With the information gathered, the Census Bureau seeks to be the leading source of quality data about the nation’s people and economy.
Transitioning from Legacy Data Management Systems
Understanding and geospatially representing growing quantities of data requires adequate data management. Since the late 1980s, the Census Bureau has used the Topologically Integrated Geographic Encoding and Referencing (TIGER®) System to provide geographic support for surveys, censuses, estimates and partnership programs. The Census Bureau developed a Database Management System (DBMS), called the TIGER database (TIGERdb), to hold geographic features such as roads, railroads, geographic areas, landmarks, waterways and other relevant geographic information. A separate database without a geospatial component, the Master Address File (MAF), was developed to hold an inventory of all residential and business addresses in the United States, Puerto Rico and associated islands.
When developed, TIGER and MAF were innovative databases: they used persistent topology, automated the production of digital mapping products, allowed batch processing for automated updates, and provided efficient retrieval and spatial indexing. However, as technology progressed, the separation of TIGER and MAF presented a number of challenges. In addition, both databases had integration problems with commercial off-the-shelf (COTS) tools and web technology. They were cumbersome to change and difficult to learn, did not allow multi-user access and were not accessible via a standard query language.
Using Oracle to Merge the Databases
Faced with data management concerns, the U.S. Census Bureau decided to replace the legacy TIGER and MAF systems with a commercial software solution, providing a seamless integration of the non-spatial MAF with the spatial TIGER database.
In 2003, the U.S. Census Bureau selected Oracle’s technology to redesign the MAF/TIGER database. The Oracle Spatial Topology Data Model provides a single national data store that seamlessly merges geospatial data (with associated feature attributes) with non-spatial residential and business address datasets. This solution provides an integrated and improved approach to scalability and database management, including replication, archiving and tuning.
This technology also makes enterprise spatial data available over the web through a single user interface. The interactive update system gives multiple users access to the merged database from multiple sites, including the 12 Census Regional Offices and the National Processing Center.
|GATRES provides web-based enabling of enterprise spatial data in a single user interface.|
From a technological perspective, the Census Bureau selected the Oracle Spatial Topology Data Model because it:
• Offers persistent topology for storing nodes, edges and faces
• Includes vertical topology, allowing multiple feature layers to share the same primitives
• Provides performance features such as partitioning, Oracle Real Application Clusters (RAC) and Oracle Application Server
• Enables topology hierarchies
These were all features that were lacking in other COTS topology management solutions.
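The first two points in the list above can be illustrated with a minimal Python sketch. It is not Oracle's schema or API: every class, field and sample value here is hypothetical. It only shows the idea that primitives (nodes and edges, with the faces on either side of an edge) are stored once, and that features in different layers refer to the same primitives rather than holding their own copies of the geometry.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass(frozen=True)
class Node:
    node_id: int
    x: float
    y: float


@dataclass(frozen=True)
class Edge:
    edge_id: int
    start_node: int   # node_id where the edge begins
    end_node: int     # node_id where the edge ends
    left_face: int    # face_id on the left of the edge
    right_face: int   # face_id on the right of the edge


@dataclass
class Feature:
    name: str
    layer: str
    edge_ids: Set[int] = field(default_factory=set)  # references, not copies


def features_sharing_edge(features: List[Feature], edge_id: int) -> List[str]:
    """Return the names of all features, in any layer, built on this edge."""
    return [f.name for f in features if edge_id in f.edge_ids]


# Hypothetical data: one stored edge is both a road and a county boundary.
nodes = {1: Node(1, 0.0, 0.0), 2: Node(2, 1.0, 0.0)}
edges = {10: Edge(10, start_node=1, end_node=2, left_face=100, right_face=101)}
features = [
    Feature("Main St", "roads", {10}),
    Feature("Example County boundary", "county_boundaries", {10}),
]

shared = features_sharing_edge(features, 10)
print(shared)
```

Because both features point at edge 10, moving that edge once would move the road and the boundary together, which is the practical benefit of sharing primitives across layers.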
Selecting the Appropriate COTS Solution
In 2004/2005, after implementing the Oracle technology, the Census Bureau began searching for an enterprise solution for an interactive update system, evaluating available COTS products in the following areas:
• Ease of customization
• Topological maintenance
• Scale-dependent display
• Concurrent access and locking mechanisms
• Scalability and load balancing
• Support for long transactions
• Temporal versioning
• Need for user training
• Technical support
ERDAS ADE was rated most favorably relative to the other products, especially in terms of ease of customization, topological maintenance and scalability. ERDAS ADE is the only native Oracle (MapViewer) solution that supports direct editing of Oracle’s Topology Data Model. ERDAS ADE also satisfied the Census Bureau’s customization requirement, providing a fine-grained Java API to implement new features or override existing features. ERDAS-provided code streamlined the selection of primitives, topological editing, database access and map manipulation, allowing flexibility to implement classes.
|Here a user selects a specific feature classification for viewing and editing.|
Updating the System Interactively
One of the Census Bureau’s most important requirements was a solution that provides an interactive update application. ERDAS ADE was specifically designed to allow multiple users to simultaneously view and edit the same feature in Oracle’s Topology Data Model.
After selecting ERDAS ADE, the Census Bureau customized the data editor to create an interactive application called Geographic ADE-based Topological Real-time Editing System (GATRES). This solution provides tools for selecting and identifying features, updating MAF units and creating links to other features. In addition, the application includes the ability to add, modify and delete:
• Point, linear and area features
• Primitives that form the basis of such features
• Address range information
Users specify the display area by searching for a geographic area such as a county, for a feature name or for geographic coordinates. GATRES allows users to pan across the displayed map area, zoom in and out, save a zoomed view and return to the original map view.
|Here the user selects and edits area features.|
Managing Topology, Features, and Metadata
Greater access and enterprise capabilities need to be accompanied by appropriate data management tools. With ERDAS ADE, the Census Bureau has a visual tool for managing primitive attributes and for tracking spatial and non-spatial features and metadata. Users work with address ranges linked to the spatial data in MAF/TIGER.
The Census Bureau is also employing other batch applications to update and manage MAF/TIGER. All applications (including GATRES) use one or more of the following components:
The topology management system:
• Checks update permission
• Ensures updates are allowed by Census topology business rules
• Assesses how changes impact features
• Understands relationships between primitives
• Assigns IDs
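The responsibilities listed above can be read as a sequence of checks applied to each proposed edit. The following Python sketch shows one way such a sequence might be ordered. It is an illustration only and does not describe the Census Bureau's implementation: the user list, the single business rule (rejecting an edge whose start and end nodes are the same), the edge-to-feature lookup and the ID counter are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from itertools import count
from typing import Optional

# Hypothetical stand-ins for permission data and feature relationships.
EDITORS = {"analyst_a"}
FEATURES_BY_EDGE = {10: ["Main St", "Example County boundary"]}
_next_edge_id = count(start=11)


@dataclass
class EdgeUpdate:
    user: str
    action: str              # "add", "modify" or "delete"
    edge_id: Optional[int]   # None when the edge does not exist yet
    start_node: int
    end_node: int


def apply_update(update: EdgeUpdate) -> dict:
    # 1. Check update permission.
    if update.user not in EDITORS:
        return {"accepted": False, "reason": "no update permission"}
    # 2. Check a business rule (made up for this sketch).
    if update.start_node == update.end_node:
        return {"accepted": False, "reason": "rule violated: start and end node are the same"}
    # 3. Work out which features are built on the affected primitive.
    impacted = FEATURES_BY_EDGE.get(update.edge_id, [])
    # 4. Assign an ID to a new primitive, or keep the existing one.
    edge_id = update.edge_id if update.edge_id is not None else next(_next_edge_id)
    return {"accepted": True, "edge_id": edge_id, "impacted_features": impacted}


result = apply_update(EdgeUpdate("analyst_a", "modify", 10, 1, 2))
print(result)
```

Rejecting an edit before any ID is assigned keeps failed edits from consuming identifiers, which is one reason to run the permission and rule checks first.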
The feature management system provides:
• Standardization of addresses and feature names
• Geocoding and address matching
• Metadata updates
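Geocoding against address ranges is often done by linear interpolation: a house number is placed along a street edge in proportion to where it falls within that edge's address range. The Python sketch below illustrates that general technique. It is not taken from MAF/TIGER or GATRES. It assumes a straight edge and a range that covers one side of the street (so house numbers share the parity of the range), and the function name and parameters are hypothetical.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]


def interpolate_address(
    house_number: int,
    from_number: int,
    to_number: int,
    start_xy: Point,
    end_xy: Point,
) -> Optional[Point]:
    """Estimate a coordinate for house_number along a straight edge.

    from_number is the address at start_xy and to_number the address at end_xy.
    Returns None when the number is outside the range or on the other side
    of the street (different odd/even parity).
    """
    low, high = sorted((from_number, to_number))
    if not low <= house_number <= high:
        return None
    if house_number % 2 != from_number % 2:
        return None
    if to_number == from_number:
        fraction = 0.5  # single-address range: use the midpoint of the edge
    else:
        fraction = (house_number - from_number) / (to_number - from_number)
    x = start_xy[0] + fraction * (end_xy[0] - start_xy[0])
    y = start_xy[1] + fraction * (end_xy[1] - start_xy[1])
    return (x, y)


# Hypothetical edge: even-numbered side, 100 to 200, running along the x axis.
location = interpolate_address(150, 100, 200, (0.0, 0.0), (100.0, 0.0))
print(location)  # (50.0, 0.0)
```

The result is an estimate rather than a surveyed position, since real parcels are rarely spaced evenly along a block.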
The session and metadata management includes:
• Operational history of adds, deletes and updates
• Global metadata management (how data was collected, when it was collected and by whom)
• Session metadata
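As a rough illustration of the three items above, the Python sketch below keeps an operational history inside an editing session and attaches who, when and how to each entry. The class, its fields and the sample values are hypothetical. The source does not describe how MAF/TIGER actually stores this information.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

ALLOWED_ACTIONS = {"add", "delete", "update"}


@dataclass
class Session:
    session_id: str
    user: str     # who made the changes
    source: str   # how the data was collected
    history: List[dict] = field(default_factory=list)

    def record(self, action: str, feature_id: int,
               when: Optional[datetime] = None) -> dict:
        """Append one add, delete or update to the session's history."""
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {action}")
        timestamp = when if when is not None else datetime.now(timezone.utc)
        entry = {
            "session_id": self.session_id,
            "user": self.user,
            "source": self.source,
            "action": action,
            "feature_id": feature_id,
            "when": timestamp.isoformat(),
        }
        self.history.append(entry)
        return entry


# Hypothetical session with two edits to the same feature.
session = Session("s-001", "analyst_a", "field canvass")
session.record("add", 10, when=datetime(2008, 4, 1, tzinfo=timezone.utc))
session.record("update", 10)
print([entry["action"] for entry in session.history])
```

Storing the collection method and user on every entry makes each record self-describing, at the cost of some repetition within a session.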
The development framework provided by ERDAS saved the Bureau months of software development effort. ERDAS ADE requires only a web browser; there is no other software requirement.
|In a zoomed-out view, the feature class and action menus are disabled.|
The Future of GATRES for U.S. Census Bureau
The U.S. Census Bureau initially decided to use handheld computers to collect data in the field. In April 2008, however, the Census Bureau announced that because of “significant schedule, performance and cost issues” (http://www.gcn.com/print/27_8/46110-1.html) most field collection would be done using paper and pencil. Although paper and pencil are about as technologically simple as it gets, entering the information collected in the field, including geographic data, into a database presents the Bureau with an entirely new set of challenges.
Given these new challenges, the shift in thinking and the technological advances made at the Bureau (namely the use of Oracle Spatial and ERDAS ADE), GATRES will become one of the most important tools used during the census count. To meet these challenges, new versions of GATRES will include the ability to use scanned maps and images in any projection as backdrops within a map. With GATRES, the Census Bureau will be able to digitize paper-based information accurately and effectively for storage in the Oracle database.