Skip to content

Commit

Permalink
Merge pull request #2 from hhow09/doc
Browse files Browse the repository at this point in the history
doc: add mongo data modeling choice in document
  • Loading branch information
hhow09 authored Jan 27, 2025
2 parents 4fd6436 + 0366fcc commit f88186e
Showing 1 changed file with 19 additions and 0 deletions.
19 changes: 19 additions & 0 deletions backend/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ npm run test
- `history`: get the latest 10 commands and results.
- [x] [Express](https://expressjs.com/) is used for the server and router.
- [x] [Socket.io](https://socket.io/) is used for WebSocket communication.
- [socket.id](https://socket.io/docs/v4/server-socket-instance/#socketid) is used as **client identifier**.
- [x] [MongoDB](https://www.mongodb.com/) is used for storing chat history.
- [Array $slice](https://www.mongodb.com/docs/manual/reference/operator/update/slice/) is used to keep only the latest history.
- `Index` is created on `clientId` for history command querying by `clientId`.
Expand All @@ -48,6 +49,24 @@ src
└── index.ts
```

## MongoDB Data Modeling
- Based on the requirement of `history` command (the only read operation), the data needed to be persisted is `clientId`, `operation`, and `result`.
- The requirement also mentioned that **only the latest 10 operations** need to be persisted.
- There are 2 possible ways to implement this:
1. each **operation** per document. e.g. `{ clientId: '1', operation: '1 + 1', result: '2', timestamp: '2025-01-25T00:00:00.000Z' }`
2. store chat history in a single document as a **chat session**. e.g. `{ clientId: '1', history: [{ operation: '1 + 1', result: '2' }, { operation: '1 + 1', result: '2' }, ...] }`.
- For the 1st approach,
- read operation is straightforward: query by `clientId` and sort by `timestamp` & limit to 10
- write operation is also straightforward: [insertOne](https://www.mongodb.com/docs/manual/reference/method/db.collection.insertOne/)
- however, additional cleanup logic is needed to remove old operations (e.g. cron job or [TTL Indexes](https://www.mongodb.com/docs/manual/core/index-ttl/))
- moreover, mongodb document also suggests that **You should consider embedding for performance reasons if you have a collection with a large number of small documents.** (ref: [Collection Contains Large Number of Small Documents](https://www.mongodb.com/docs/manual/core/data-model-operations/#collection-contains-large-number-of-small-documents))
- For the 2nd approach,
- also known as [Bucket Pattern](https://www.mongodb.com/docs/manual/data-modeling/design-patterns/group-data/bucket-pattern/)
- We need to [Avoid Unbounded Arrays](https://www.mongodb.com/docs/manual/data-modeling/design-antipatterns/unbounded-arrays/) to grow over document size limit 16MB. However, only keeping the latest 10 operations should not be an issue.
- write operation: there is out-of-box solution of [array push](https://www.mongodb.com/docs/manual/reference/operator/update/push/#mongodb-update-up.-push) with [$slice](https://www.mongodb.com/docs/manual/reference/operator/update/slice/) option in single operation.
- read operation is also straightforward: simply query by `clientId`.
- In summary, the 2nd approach is more suitable for this case.

## Math Calculation
### Evalutaion Algorithm
1. Parse the expression string by splitting by `+` and `-`, resulting a list of sub-expressions (`Summand`).
Expand Down

0 comments on commit f88186e

Please sign in to comment.