ACM TechNews

When the Meteor and the 1PB Database Collide

Computerworld (08/08/08) Lai, Eric

The Panoramic Survey Telescope and Rapid Response System (Pan-STARRS) is a joint venture between the U.S. Air Force and a number of universities that aims to track asteroids and other near-Earth objects through satellite and telescopic observation. Economically compressing, storing, and crunching the massive volume of raw image data that Pan-STARRS produces is an enormous database engineering challenge. Johns Hopkins University professor Alex Szalay says Pan-STARRS will use a cluster of 50 PC servers linked to 1.1 petabytes of disk storage through fast InfiniBand networking gear. Pan-STARRS chose Microsoft's SQL Server 2008 for database management, rather than a program better established for ultralarge data warehouses, based on factors that include cost and Microsoft's long association with the astronomical community. That association came mostly through database researcher Jim Gray, who played a critical role in the construction of earlier databases of astronomical and Earth imagery. Pan-STARRS, which Gray partly conceived, will contain 300 terabytes of data by the end of the decade, and Szalay says some individual tables will be as large as 20 TB. Because the system is clustered, its data will be partitioned, with an independent names database functioning as the index. Szalay says most searches will be facilitated through a graphical interface that "looks and feels a lot like MapQuest or Google Maps."

http://www.computerworld.com/action/article.do?command=printArticleBasic&articleId=9112018


© Copyright 2008 Information, Inc. This service may be reproduced for internal distribution.