One of the greatest challenges in location analytics is the optimization and acceleration of the relational database model. Significant numbers of local, state, and federal government users, combined with substantial commercial interests, have developed relational databases for storage and retrieval of spatial topology and geometry (points, lines, and polygons) using IBM’s DB2 technology.
As Real Time Analytical Processing (RTAP) technologies such as InfoSphere Streams have developed, and as pressure has increased to provide immediate insight and decision support across the enterprise, IBM has responded with a series of technologies to enhance the “in-memory” database. Learning how to leverage massive spatial database investments and connect these resources with the Internet of Things (IoT) is paramount.
“Business at the speed of thought” is a business reality today and a requirement for staying competitive. Businesses need to operate in real time in order to bridge successfully into the future. Add the computationally intensive analytics often associated with spatial vector and raster processing in the spatial RDBMS, the ingestion of millions of GPS-centric transactions a second through streaming protocols such as MQ Telemetry Transport (MQTT), and the importance of user visualization, and one finds that businesses are faced with a perfect technical storm. One way to harness the power of the storm is IBM’s DB2 BLU Acceleration.
Improving overall enterprise application performance begins with BLU Acceleration, a cost-effective way of enhancing rich geospatial applications, including those driven by a significant trend toward Machine-to-Machine (MtM) analytics. There are a number of key technical reasons why BLU Acceleration improves the overall performance of DB2 data, helping organizations maximize their legacy investment.
First, the new DB2 10.5 “Cancun Release” (10.5.0.4) uses BLU “Shadow Tables” to automatically maintain a column-based version of row-based operational data. Analytic queries are routed to these Shadow Tables, whose column organization is well suited to fast processing.
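The idea behind a shadow table can be illustrated with a minimal Python sketch. This is not DB2 code: the `ShadowedTable` class, its method names, and the parcel data are hypothetical, and the sketch updates the column copy synchronously on every insert for simplicity, which a real database may not do. It shows only the principle that one table can keep a row-organized copy for transactional lookups and a column-organized copy for analytic scans.

```python
class ShadowedTable:
    """Toy table that keeps a row-organized copy and a column-organized shadow copy."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.rows = []                                      # row-organized operational data
        self.shadow = {name: [] for name in self.columns}   # column-organized shadow copy

    def insert(self, row):
        # Write the operational row, then maintain the shadow copy automatically.
        self.rows.append(dict(row))
        for name in self.columns:
            self.shadow[name].append(row[name])

    def lookup(self, key_column, key):
        # Transactional path: fetch one whole row from the row-organized copy.
        for row in self.rows:
            if row[key_column] == key:
                return row
        return None

    def total(self, column):
        # Analytic path: routed to the shadow copy, touching only one column.
        return sum(self.shadow[column])


parcels = ShadowedTable(["parcel_id", "county", "area_sq_m"])
parcels.insert({"parcel_id": 1, "county": "Adams", "area_sq_m": 1200})
parcels.insert({"parcel_id": 2, "county": "Boulder", "area_sq_m": 800})
parcels.insert({"parcel_id": 3, "county": "Adams", "area_sq_m": 1500})

total_area = parcels.total("area_sq_m")
second_parcel = parcels.lookup("parcel_id", 2)
```

The aggregate reads a single list of numbers rather than every field of every row, which is the reason a column organization tends to suit analytic queries.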
Second, BLU dynamically optimizes the movement of data from storage to system memory to CPU memory (cache). This allows BLU to maintain in-memory performance even when active data sets are larger than system memory. For those of us who continue to use SQL queries for our Geographic Information System (GIS) and manage our RDBMS memory and power allocation, this feature is very important: a lack of available memory can bring an enterprise GIS on a WAN or LAN, and every concurrent user workflow on it, to a halt.
Third, “actionable compression” maintains the order of the data, enabling compressed data in BLU Acceleration tables to be used without decompression. This allows key, time-consuming spatial functions such as relates and joins to be performed directly on compressed data. The most frequent values are encoded with fewer bits to optimize the compression. This is also very helpful when processing highly repetitive functions like geocoding, address queries, address verification, network routing, network optimization, and the address range calculations associated with delivery services and transportation.
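To see why preserving order lets a query run on compressed data, consider a minimal Python sketch of an order-preserving dictionary encoding. It is an illustration under stated assumptions rather than BLU's actual encoding: the function names and postal-code data are hypothetical, and the sketch uses fixed-size integer codes, leaving out the frequency-based bit lengths described above.

```python
from bisect import bisect_left, bisect_right


def build_dictionary(values):
    """Assign each distinct value an integer code that preserves sort order."""
    distinct = sorted(set(values))
    return distinct, {value: code for code, value in enumerate(distinct)}


def encode(values, codes):
    """Replace every value with its integer code."""
    return [codes[value] for value in values]


def range_filter_encoded(encoded, distinct, low, high):
    """Return row positions whose value lies in [low, high], comparing codes only."""
    # Translate the predicate bounds into code space once...
    low_code = bisect_left(distinct, low)
    high_code = bisect_right(distinct, high) - 1
    # ...then scan the encoded column without decoding any row.
    return [pos for pos, code in enumerate(encoded) if low_code <= code <= high_code]


postal_codes = ["80202", "10001", "80202", "94105", "60601", "80202", "10001"]
distinct, codes = build_dictionary(postal_codes)
encoded = encode(postal_codes, codes)

# Find rows with postal codes between "60000" and "89999" inclusive.
matches = range_filter_encoded(encoded, distinct, "60000", "89999")
```

Because the codes sort the same way as the original values, the range predicate is translated into code space once and every row comparison then happens on small integers.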
Fourth, CPU acceleration has been engineered to process huge volumes of data simultaneously by getting more work out of each CPU: multi-core parallelism, SIMD (single instruction, multiple data) processor support, and parallelization of data processing are all techniques used to extend the CPU and process with less latency.
Fifth, and very innovative, is a “data skipping” capability wherein small portions of data are examined to determine whether they contain information relevant to the analytical function at hand. For example, do these data have the pixel values I need in order to perform the raster function on the image BLOB? If not, I am not going to waste compute resources on that space and its raster features, and I move on to the next portion. The relevant portions are described as “hot” portions or hot zones in the data. Paying attention to these hot zones and ignoring irrelevant data adds a significant “intelligence” layer to the standard DB2 query.
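Data skipping can be sketched in a few lines of Python. The sketch assumes a simple design in which the minimum and maximum value of each fixed-size block of a column are recorded in a small synopsis; the function names, block size, and elevation values are hypothetical and are not taken from DB2. A range query consults the synopsis first and reads only the blocks that could contain matching values.

```python
def build_synopsis(column, block_size):
    """Record (start position, min, max) for each fixed-size block of the column."""
    synopsis = []
    for start in range(0, len(column), block_size):
        block = column[start:start + block_size]
        synopsis.append((start, min(block), max(block)))
    return synopsis


def scan_with_skipping(column, synopsis, block_size, low, high):
    """Return positions with low <= value <= high and the number of blocks read."""
    hits = []
    blocks_read = 0
    for start, block_min, block_max in synopsis:
        # Skip any block whose value range cannot overlap the query range.
        if block_max < low or block_min > high:
            continue
        blocks_read += 1
        for offset, value in enumerate(column[start:start + block_size]):
            if low <= value <= high:
                hits.append(start + offset)
    return hits, blocks_read


# Hypothetical raster cell values stored as one column, four cells per block.
elevations = [12, 15, 11, 14, 210, 198, 205, 220, 30, 28, 35, 31]
synopsis = build_synopsis(elevations, block_size=4)

hits, blocks_read = scan_with_skipping(elevations, synopsis, 4, 195, 215)
```

In this example only one of the three blocks is read; the other two are ruled out from the synopsis alone, which is where the savings in compute come from.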
Each of these BLU features, taken on its own, would represent only a modest advancement in the field of database acceleration and in-memory computing. However, when combined into a complete feature set and deployed into the DB2 stack, IBM’s BLU greatly improves performance and represents a substantial technical advancement. This advancement allows the organization to greatly extend the value of its DB2 investment and deliver next-generation database-driven applications.
As stated, BLU allows the DB2 stack to move into the streaming, MtM, and IoT horizon more effectively and to perform in real time at network speed. It is impractical, even preposterous, to expect that organizations with billions of dollars in database investment will start over and somehow magically build streaming analytics from scratch. Fortunately, BLU Acceleration allows those investments to interact with real-time systems in a more meaningful and computationally effective manner.
This will be more important than ever with the GPS-enabled sensor systems and connected-device applications associated with the IoT. Furthermore, BLU Acceleration is especially valuable for certain spatial functions and features of the relational database engine.
The core of a true GIS is the relational database and how a topology is maintained, edited, and analyzed. For a GIS to remain relevant over the next five years, technologies like BLU will become more important to the thousands of organizations that are struggling to keep pace with the demand for instantaneous insights, decision augmentation, and true machine-to-machine computation.