Stig: Social Graphs & Discovery at Scale
Jason Lucas
Software Architect


Wednesday, August 24, 2011
10:30 AM - 11:20 AM

Level:  Technical - Intermediate

Stig is a distributed graph database built from scratch at Tagged. It handles data and user demand at Tagged's scale: 100M+ users and 6B+ page views per month. It is particularly suited to graph-based applications involving large volumes of data, transactional updates, and inference-driven queries.

The goal of the stig project is to increase the productivity of web programmers. To this end, the system hides the details of its distributed architecture and provides the application programmer with a single, consistent, and reliable path to data. The query language is highly expressive and composable, but also easy to use and stocked with helpful libraries.

Benefits to Audience:

In this session, we will introduce you to the architecture, concepts, and language of stig and show you how to integrate it into your projects. In particular, you’ll learn that stig:

  • Represents graphs as graphs instead of as tables or strings, provides a rich type system for describing the contents of nodes, and automatically distributes the work of query evaluation
  • Replaces all-or-nothing commits with conditional points of view and asynchronously increasing levels of durability guarantees, while giving the application programmer as much or as little control over the flow of causality as desired
  • Uses inferred edges in lieu of relational projections or indices and allows nodes to cluster at named locations, providing simple mechanisms with which to evolve schemas
  • Uses time-domain addressing with explicit assertions in lieu of locking, uses field calls to increase concurrency, and allows queries to make progress even when the client is disconnected
  • Is written in C++0x with a query compiler written in Haskell, has client libraries available for C/C++, PHP, Java, and Python, and is 100% open source

This is a technical presentation appropriate for audiences with intermediate-level experience in NoSQL databases. We’ve used some of this material in a college setting (at the University of Waterloo in Ontario) and at a technical working group in the Bay Area, but this is the first time we’ve pulled it all together in this form.

Jason Lucas is the scalability architect for Tagged, one of the largest social networking systems in the world. Jason previously worked for Google on large-scale, distributed systems and for Microsoft on the Visual C++ compiler. He also spent almost ten years working on artificial intelligence systems for treating HIV/AIDS in Africa. These days Jason focuses on problems in the NoSQL space, creating planetary-scale data services that are reliable, fast, cheap, and, if at all possible, easy to use.
