- Learning Apache Cassandra(Second Edition)
- Sandeep Yarabarla
- 2021-07-03 00:19:34
Anatomy of a compound primary key
At this point, it's clear that there's some nuance in the compound primary key that we're missing. Both the username column and the id column affect the order in which rows are returned; however, while the actual ordering of username is opaque, the ordering of id is meaningfully related to the information encoded in the id column.
In the lexicon of Cassandra, username is a partition key. A table's partition key groups rows together into logically related bundles. In the case of our MyStatus application, each user's timeline is a self-contained data structure, so partitioning the table by user is a sound strategy.
We call the id column a clustering column. The job of a clustering column is to determine the ordering of rows within a partition. This is why we observed that, within each user's status updates, the rows were returned in strictly ascending order of the timestamp encoded in the id. This is a very useful property, since our application will want to display status updates ordered by creation time.
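To make the two roles concrete, here is a minimal Python sketch of the idea, not a description of how Cassandra is actually implemented. The `StatusTable` class and its method names are hypothetical, and plain integers stand in for the time-based `id` values. Rows are grouped by partition key and kept in clustering order within each group.

```python
import bisect
from collections import defaultdict


class StatusTable:
    """Toy model of a table whose primary key is (username, id).

    Hypothetical illustration only: username plays the partition key,
    id plays the clustering column.
    """

    def __init__(self):
        # partition key -> list of (clustering value, body), kept sorted
        self._partitions = defaultdict(list)

    def insert(self, username, status_id, body):
        # Place the row at its sorted position inside the user's partition.
        # (A real upsert would overwrite a row with the same full key;
        # this sketch does not model that.)
        bisect.insort(self._partitions[username], (status_id, body))

    def select(self, username):
        # Rows come back in clustering order without sorting at read time.
        return list(self._partitions.get(username, []))


table = StatusTable()
table.insert("alice", 3, "third")
table.insert("bob", 1, "bob's first")
table.insert("alice", 1, "first")
table.insert("alice", 2, "second")

alice_rows = table.select("alice")
print(alice_rows)
```

Although alice's rows were inserted out of order, reading her partition yields them sorted by `id`, and bob's rows never mix with hers.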
Is sorting by clustering column efficient?
Sorting any collection at read time is expensive for a non-trivial number of elements. Fortunately, Cassandra stores rows in clustering order, so when you retrieve them, it simply returns them in the order in which they're stored. There's no expensive sorting operation at read time.
All of the rows that share the same partition key are stored in a contiguous structure on disk. It's within this structure that rows are sorted by their clustering column values. Because each partition is tightly bound at the storage level, there is an upper bound on the number of rows that can share the same partition key. In theory, this limit is about 2 billion total column values. For instance, if you have a table with 10 data columns, your upper bound would be 200 million rows per partition key.
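The arithmetic behind that bound is a simple division, sketched below. The constant uses the approximate figure of 2 billion from the text, and the function name is a hypothetical helper for illustration.

```python
# Approximate theoretical limit on column values per partition,
# as stated in the text (an upper bound, not a recommended size).
MAX_COLUMN_VALUES_PER_PARTITION = 2_000_000_000


def max_rows_per_partition(data_columns: int) -> int:
    """Estimate the row ceiling for a partition given the data column count."""
    if data_columns < 1:
        raise ValueError("data_columns must be at least 1")
    return MAX_COLUMN_VALUES_PER_PARTITION // data_columns


print(max_rows_per_partition(10))  # 200000000
```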
For further information on data modeling using compound primary keys, the DataStax CQL documentation has a good explanation at http://www.datastax.com/documentation/cql/3.1/cql/ddl/ddl_compound_keys_c.html.