- Mastering MongoDB 3.x
- Alex Giamas
- 562字
- 2021-08-20 10:10:47
Schema design best practices
MongoDB is schema-less and you have to design your collections and indexes to accommodate for this fact:
- Index early and often: Identify common query patterns using MMS, Compass GUI, or logs and index for these early and using as many indexes as possible at the beginning of a project.
- Eliminate unnecessary indexes: A bit counter-intuitive to the preceding suggestion, monitor your database for changing query patterns and drop the indexes that aren't being used. An index will consume RAM and I/O as it needs to be stored and updated alongside with documents in the database. Using an aggregation pipeline and $indexStats a developer can identify indexes that are seldom being used and eliminate them.
- Use a compound index rather than index intersection: Querying with multiple predicates (A and B , C or D and E and so on) will most of the time work better with a single compound index than with multiple simple indexes. Also, a compound index will have its data ordered by field and we can use this to our advantage when querying. An index on fields A,B,C will be used in queries for A, (A,B), (A,B,C) but not in querying for (B,C) or (C) .
- Low selectivity indexes: Indexing a field on gender for example will statistically still return half of our documents back, whereas an index on last name will only return a handful of documents with the same last name.
- Use of regular expressions: Again, since indexes are ordered by value, searching using a regular expression with leading wildcards (that is, /.*BASE/) won't be able to use the index. Searching with trailing wildcards (that is, /DATA.*/) can be efficient as long as there are enough case sensitive characters in the expression.
- Avoid negation in queries: Indexes are indexing values, not the absence of them. Using NOT in queries can result in full table scans instead of using the index.
- Use partial indexes: If we need to index a subset of the documents in a collection, partial indexes can help us minimize the index set and improve performance. A partial index will include a condition on the filter that we use in the desired query.
- Use document validation: Use document validation to monitor for new attributes being inserted to your documents and decide what to do with them. With document validation set to warn, we can keep a log of documents that were inserted with arbitrary attributes that we didn't expect during the design phase and decide if this is a bug or a feature of our design.
- Use MongoDB Compass: MongoDB's free visualization tool is great to get a quick overview of our data and how it grows across time.
- Respect the maximum document size of 16 MB: The maximum document size for MongoDB is 16 MB. This is a fairly generous limit but it is one that should not be violated under any circumstance. Allowing documents to grow unbounded should not be an option and as efficient as it may be to embed documents, we should always keep in mind that this should be under control.
- Use the appropriate storage engine: MongoDB has introduced several new storage engines since version 3.2. The in-memory storage engine should be used for real-time workloads, whereas the encrypted storage engine should be the engine of choice when there are strict requirements around data security.
推薦閱讀
- Microsoft Power BI Quick Start Guide
- 中文版Photoshop CS5數(shù)碼照片處理完全自學(xué)一本通
- 2018西門子工業(yè)專家會(huì)議論文集(上)
- 大數(shù)據(jù)時(shí)代的數(shù)據(jù)挖掘
- Expert AWS Development
- STM32G4入門與電機(jī)控制實(shí)戰(zhàn):基于X-CUBE-MCSDK的無(wú)刷直流電機(jī)與永磁同步電機(jī)控制實(shí)現(xiàn)
- 最簡(jiǎn)數(shù)據(jù)挖掘
- 計(jì)算機(jī)網(wǎng)絡(luò)技術(shù)實(shí)訓(xùn)
- 計(jì)算機(jī)網(wǎng)絡(luò)技術(shù)基礎(chǔ)
- 21天學(xué)通Java
- 樂(lè)高機(jī)器人—槍械武器庫(kù)
- MATLAB/Simulink權(quán)威指南:開(kāi)發(fā)環(huán)境、程序設(shè)計(jì)、系統(tǒng)仿真與案例實(shí)戰(zhàn)
- Windows Server 2008 R2活動(dòng)目錄內(nèi)幕
- MCGS嵌入版組態(tài)軟件應(yīng)用教程
- TensorFlow Deep Learning Projects