Part of OfficeSpace
Hydra
Massive-scale commercial-real-estate listing data processing system at OfficeSpace.com.
Hydra was the data backbone of OfficeSpace.com: a large-scale pipeline that ingested commercial-real-estate listing data from a wide range of semi-structured, inconsistent sources, then cleaned, deduplicated, normalized, and synthesized it into a coherent, queryable dataset. The hard problems were scale and messiness: reconciling overlapping records, handling source-specific quirks, and keeping the synthesized view fresh as inputs changed. Hydra fed the listings that the web app served to 100K+ monthly users.
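To make the normalize, deduplicate, and synthesize steps concrete, here is a minimal Python sketch. It does not show Hydra's actual code: the `Listing` fields, the abbreviation table, the address-plus-suite matching key, and the "newest record wins, older record fills gaps" merge rule are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Listing:
    """Hypothetical listing record; field names are illustrative only."""
    source: str
    address: str
    suite: Optional[str]
    square_feet: Optional[int]
    updated_at: str  # ISO 8601 date, so plain string comparison orders correctly


# Assumed abbreviation table; a real pipeline would need a far larger one.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "boulevard": "blvd"}


def normalize_address(raw: str) -> str:
    """Lowercase, strip punctuation, collapse whitespace, abbreviate street types."""
    text = re.sub(r"[^\w\s]", "", raw.lower())
    return " ".join(ABBREVIATIONS.get(word, word) for word in text.split())


def dedup_key(listing: Listing) -> tuple:
    """Records sharing a normalized address and suite are treated as one listing."""
    return (normalize_address(listing.address), (listing.suite or "").strip().lower())


def merge(a: Listing, b: Listing) -> Listing:
    """Prefer the newer record; fall back to the older one for missing fields."""
    newer, older = (a, b) if a.updated_at >= b.updated_at else (b, a)
    return Listing(
        source=newer.source,
        address=newer.address,
        suite=newer.suite if newer.suite is not None else older.suite,
        square_feet=(
            newer.square_feet if newer.square_feet is not None else older.square_feet
        ),
        updated_at=newer.updated_at,
    )


def synthesize(records: Iterable[Listing]) -> list:
    """Fold overlapping records from all sources into one record per listing."""
    merged = {}
    for record in records:
        key = dedup_key(record)
        merged[key] = merge(merged[key], record) if key in merged else record
    return list(merged.values())
```

Because `synthesize` folds records one at a time, re-running it as sources change is one simple way to keep the combined view fresh. A system at the scale described would likely also need fuzzy matching and incremental updates, neither of which this sketch attempts.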