Cone Searching & HTM

Many of the queries we want to run relate to the position of objects in the sky. We use HTM (Hierarchical Triangular Mesh) to index the sky, and use its API to augment our SQL queries. In this document I will attempt to use HTM with Cassandra. This may of course be a bad match of technologies (see issues below), and there may be other spatial indexing technologies that tie in better with Cassandra.
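For reference, the cone search that the index is meant to speed up can be sketched as a brute-force test: keep every object whose angular separation from the cone centre is within the search radius. This is only an illustration and does not use the HTM API; the function names and the `ra`/`dec` dictionary keys are assumptions made for the example.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error for near-antipodal points.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def cone_search(objects, ra, dec, radius_deg):
    """Brute-force cone search over dicts with hypothetical 'ra' and 'dec' keys."""
    return [o for o in objects
            if angular_separation_deg(ra, dec, o["ra"], o["dec"]) <= radius_deg]
```

Checking every object like this scales with the size of the catalogue, which is why a spatial index is wanted: it narrows the candidates before the exact separation test is applied.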

...

Cassandra Data Model | Relational Data Model
Keyspace             | Database
Column Family        | Table
Partition Key        | Primary Key
Column Name/Key      | Column Name
Column Value         | Column Value

...

Note the “partition key” versus “primary key” terminology; we’ll come back to that. The partition key determines how the data is distributed across the nodes of a cluster.

In Cassandra, replication is built in, and the replication factor determines how many nodes in total hold a copy of the data. In the following example, objects with partition key 10 would be stored in Node 1. If the replication factor is 3, then the data is also replicated (clockwise) to Nodes 2 and 3. Likewise, an object with a key value of 83 would be stored in Node 4, and replicated to Nodes 1 and 2.
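The clockwise placement described above can be sketched in a few lines. This is a toy model, not Cassandra's implementation: it assumes a four-node ring with evenly split token ranges 0-99 (chosen so that 10 lands on Node 1 and 83 on Node 4), and it uses the key value directly as the token, whereas Cassandra hashes the partition key to obtain a token. The `RING` layout and the `replicas` function are hypothetical names for this example.

```python
from bisect import bisect_left

# Assumed ring: each node owns the tokens up to and including its upper bound.
RING = [(24, "Node 1"), (49, "Node 2"), (74, "Node 3"), (99, "Node 4")]

def replicas(token, replication_factor):
    """Return the nodes holding `token`: the owner first, then clockwise neighbours."""
    if not 0 <= token <= RING[-1][0]:
        raise ValueError("token outside the assumed ring range")
    if not 1 <= replication_factor <= len(RING):
        raise ValueError("replication factor must be between 1 and the node count")
    upper_bounds = [upper for upper, _ in RING]
    owner = bisect_left(upper_bounds, token)  # first node whose range covers the token
    # Walk clockwise around the ring, wrapping past the last node.
    return [RING[(owner + i) % len(RING)][1] for i in range(replication_factor)]

print(replicas(10, 3))  # ['Node 1', 'Node 2', 'Node 3']
print(replicas(83, 3))  # ['Node 4', 'Node 1', 'Node 2']
```

With a replication factor of 3, three nodes in total hold each object, which is why key 83 wraps around from Node 4 to Nodes 1 and 2.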

...

My installation of Cassandra is just a single node on my laptop. Even so, it seems surprisingly robust, if not quick. I’m still figuring out how to insert and read data from multiple threads, so for the time being all access is single-threaded. I’m sure the timings can be improved.

Partition Keys and Clustering Keys