Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sample Data and Reset Data Scripts #3285

Conversation

JaiPannu-IITI
Copy link

@JaiPannu-IITI JaiPannu-IITI commented Feb 23, 2025

Overview

Added 2 clean scripts with fail check implementation and vigorous error handling.

Closes: #3214

1. pnpm add:sample_data

This script is designed to handle adding sample data to already existing or formatted database and sample post_attachments to already existing or formatted Minio bucket without fail. Duplicates are handled with ease. Already existing data is not affected.

This script is non-interactive and additive in nature (non-destructive)

Scope: Available in all environments ( Dev + Prod + Demo )

Use case : This script can be used in any cron job without worrying about state of database.

Output

image

2. pnpm reset:data

This script completely formats database and minio bucket, administrator access is re-established.

This script is interactive and destructive in nature.

Scope: Available in all environment ( Dev + Prod + Demo ) but not recommended in production unless needed on a serious note.

Use case: This script should be used manually with care when needed on a serious note.

Output

image

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Introduced separate commands for resetting and adding sample data, streamlining database management.
    • Enhanced sample attachments with a modern image format and clearer, more descriptive naming.
    • Added a new configuration option to better support object storage services.
  • Chores

    • Removed outdated dependencies and consolidated data management workflows for improved system efficiency.
    • Deleted the previous sample data loading script to simplify the data management process.

Copy link

coderabbitai bot commented Feb 23, 2025

Walkthrough

This pull request restructures how sample data is managed. The dependency list and scripts in package.json have been updated to remove unused libraries and replace a single data import command with two discrete scripts for resetting and adding sample data. New TypeScript files have been introduced to handle these operations and to provide helper functions for PostgreSQL and MinIO integrations, while an obsolete data loading file was removed. Additionally, a new environment configuration property has been added and sample attachment files have been updated.

Changes

File(s) Summary of Changes
package.json Removed dependencies (dotenv, inquirer, @types/inquirer, uuid), removed "import:sample-data" script; added "reset:data" and "add:sample_data" scripts and retained "uuidv7".
scripts/dbManagement/addSampleData.ts New file providing functionality to add sample data to the database with error handling and controlled execution flow.
scripts/dbManagement/resetData.ts New file for resetting the database, emptying the MinIO bucket, and verifying administrator existence with confirmation and error handling.
scripts/dbManagement/helpers.ts New helper module containing utility functions for PostgreSQL and MinIO operations including database formatting, data insertion, file management, and connection handling.
scripts/dbManagement/sample_data/post_attachments.json Updated attachment entries: changed mimeType from "image/jpeg" to "image/webp" and updated name fields to more descriptive values.
src/envConfigSchema.ts Added a new property, MINIO_ROOT_USER, along with a comment referencing MinIO client documentation.
src/utilities/loadSampleData.ts Removed obsolete file previously handling sample data loading operations.

Possibly related PRs

  • fixed loadSampleData.ts and added tests  #2807: The changes in the main PR are related to the modifications in the retrieved PR, as both involve the management of sample data, specifically through the addition of new scripts and the restructuring of existing functionality in the loadSampleData.ts file.
  • Sample Data and Reset Data Scripts #3285: The changes in the main PR are directly related to those in the retrieved PR, as both involve modifications to the package.json file and the introduction of the addSampleData.ts and resetData.ts scripts, which share similar functionalities and structure.
  • fixed : #2841 NoMongo: Create an equivalent sample database for importation by the setup script  #3118: The changes in the main PR, which involve the removal of the loadSampleData.ts file and the introduction of new scripts for managing sample data, are directly related to the modifications in the retrieved PR that also focuses on managing sample data through the loadSampleData.ts file, indicating a shift in how sample data is handled in the project.

Suggested labels

ignore-sensitive-files-pr

Suggested reviewers

  • palisadoes
  • varshith257
✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@JaiPannu-IITI JaiPannu-IITI marked this pull request as ready for review February 23, 2025 09:59
Copy link

Our Pull Request Approval Process

Thanks for contributing!

Testing Your Code

Remember, your PRs won't be reviewed until these criteria are met:

  1. We don't merge PRs with poor code quality.
    1. Follow coding best practices such that CodeRabbit.ai approves your PR.
  2. We don't merge PRs with failed tests.
    1. When tests fail, click on the Details link to learn more.
    2. Write sufficient tests for your changes (CodeCov Patch Test). Your testing level must be better than the target threshold of the repository
    3. Tests may fail if you edit sensitive files. Ask to add the ignore-sensitive-files-pr label if the edits are necessary.
  3. We cannot merge PRs with conflicting files. These must be fixed.

Our policies make our code better.

Reviewers

Do not assign reviewers. Our Queue Monitors will review your PR and assign them.
When your PR has been assigned reviewers contact them to get your code reviewed and approved via:

  1. comments in this PR or
  2. our slack channel

Reviewing Your Code

Your reviewer(s) will have the following roles:

  1. arbitrators of future discussions with other contributors about the validity of your changes
  2. point of contact for evaluating the validity of your work
  3. person who verifies matching issues by others that should be closed.
  4. person who gives general guidance in fixing your tests

CONTRIBUTING.md

Read our CONTRIBUTING.md file. Most importantly:

  1. PRs with issues not assigned to you will be closed by the reviewer
  2. Fix the first comment in the PR so that each issue listed automatically closes

Other

  1. 🎯 Please be considerate of our volunteers' time. Contacting the person who assigned the reviewers is not advised unless they ask for your input. Do not @ the person who did the assignment otherwise.
  2. Read the CONTRIBUTING.md file make

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 14

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 381c644 and 2ae1db8.

⛔ Files ignored due to path filters (6)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
  • sample_data/images/01JMQ3EABXAD1F3KTW1NQGF99C.jpeg is excluded by !**/*.jpeg
  • sample_data/images/01JMQ3F1B55K272B25V19781D0.jpeg is excluded by !**/*.jpeg
  • sample_data/images/01JMQ3F9E5C806CN6M8EV31E01.jpeg is excluded by !**/*.jpeg
  • sample_data/images/01JMQ3FGBEDRB9AM6XP9C97RF8.jpeg is excluded by !**/*.jpeg
  • sample_data/images/01JMQ3FPS789Z1J6BBQ01YDHVQ.jpeg is excluded by !**/*.jpeg
📒 Files selected for processing (7)
  • package.json (1 hunks)
  • scripts/dbManagement/addSampleData.ts (1 hunks)
  • scripts/dbManagement/helpers.ts (1 hunks)
  • scripts/dbManagement/resetData.ts (1 hunks)
  • scripts/dbManagement/sample_data/post_attachments.json (1 hunks)
  • src/envConfigSchema.ts (1 hunks)
  • src/utilities/loadSampleData.ts (0 hunks)
💤 Files with no reviewable changes (1)
  • src/utilities/loadSampleData.ts
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Analyse Code With CodeQL (typescript)
🔇 Additional comments (8)
scripts/dbManagement/helpers.ts (4)

53-65: Interactive confirmation may fail in non-interactive/CI environments
askUserToContinue uses readline and expects a TTY. If run in a script or pipeline without a terminal, it may hang or fail. You might want to provide a fallback mechanism or skip prompts in non-interactive contexts.


94-137: Potential security concern when resetting user roles
ensureAdministratorExists forces role escalation if the user already exists but is not an administrator. Confirm that’s the desired behavior. Additionally, storing admin password in an environment variable could be risky in certain deployments.


529-532: Null dates may conflict with table constraints
parseDate returns null on invalid dates, potentially conflicting with not-null fields. If the schema prohibits null, consider raising an error or using a fallback value to avoid runtime exceptions.


584-593: Single-client usage in disconnect
disconnect calls queryClient.end() unconditionally. If multiple scripts or processes share the same client, this may cause unexpected resource release. Ensure that each script has its own client or carefully coordinates usage.

scripts/dbManagement/addSampleData.ts (1)

63-83: Correct exit code usage
If an error occurs, the script sets exitCode = 1 and continues gracefully. This is good practice. Just ensure unhandled rejections do not override this logic.

scripts/dbManagement/resetData.ts (2)

12-60: Guard statements after formatting failure
While you do a partial rollback if formatDatabase fails, the subsequent steps (like emptyMinioBucket) still occur. If that’s unintentional, you could add more checks or early returns upon severe errors.


62-87: Solid pattern for graceful shutdown
Running main in an IIFE and then disconnecting ensures a clean exit. The exit code is updated upon errors, which improves script reliability.

scripts/dbManagement/sample_data/post_attachments.json (1)

7-8: Great improvements to image format and naming!

The changes bring several benefits:

  1. WebP format offers better compression and quality compared to JPEG.
  2. Descriptive naming using DALL·E prefix clearly indicates AI-generated content.

Also applies to: 15-16, 23-24, 31-32, 39-40, 47-48

Copy link

codecov bot commented Feb 23, 2025

Codecov Report

Attention: Patch coverage is 20.09646% with 497 lines in your changes missing coverage. Please review.

Project coverage is 48.36%. Comparing base (fc58125) to head (f8e9bc7).
Report is 1 commits behind head on develop-postgres.

Files with missing lines Patch % Lines
scripts/dbManagement/helpers.ts 6.18% 455 Missing ⚠️
scripts/dbManagement/addSampleData.ts 66.12% 21 Missing ⚠️
scripts/dbManagement/resetData.ts 71.62% 21 Missing ⚠️
Additional details and impacted files
@@                 Coverage Diff                  @@
##           develop-postgres    #3285      +/-   ##
====================================================
- Coverage             48.41%   48.36%   -0.05%     
====================================================
  Files                   456      458       +2     
  Lines                 34225    34516     +291     
  Branches                937      966      +29     
====================================================
+ Hits                  16570    16695     +125     
- Misses                17655    17821     +166     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@JaiPannu-IITI
Copy link
Author

JaiPannu-IITI commented Feb 23, 2025

Adding tests for the final scripts..... Will inform when PR is ready.

coderabbitai[bot]
coderabbitai bot previously approved these changes Feb 23, 2025
@Cioppolo14
Copy link
Contributor

There isn’t an issue assigned to you for this PR. Please follow the guidelines in our PR_GUIDELINES.md file. We have the procedures in place so that everyone has a fair chance of contributing. I will be closing this pull request. Please follow the procedures and resubmit when ready.

@Cioppolo14 Cioppolo14 closed this Feb 23, 2025
@JaiPannu-IITI
Copy link
Author

@Cioppolo14 I am assigned #3214

@JaiPannu-IITI
Copy link
Author

JaiPannu-IITI commented Feb 23, 2025

Title is different as during development our goals got changed a little bit

@JaiPannu-IITI
Copy link
Author

JaiPannu-IITI commented Feb 23, 2025

#3277
#3223
#3278
#3264
#3227

These are previous discussions to solve the issue.

@JaiPannu-IITI
Copy link
Author

Can this be reopened or should I raise another PR?

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 21b720e and a848757.

📒 Files selected for processing (3)
  • scripts/dbManagement/helpers.ts (1 hunks)
  • test/scripts/dbManagement/addSampleData.test.ts (1 hunks)
  • test/scripts/dbManagement/resetDB.test.ts (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
scripts/dbManagement/helpers.ts (1)
Learnt from: JaiPannu-IITI
PR: PalisadoesFoundation/talawa-api#3285
File: scripts/dbManagement/helpers.ts:70-93
Timestamp: 2025-02-23T10:18:23.438Z
Learning: The database management utility functions in `scripts/dbManagement/helpers.ts` are designed to be used only within their specific scripts due to their destructive nature. Error handling is intentionally handled at the script level rather than within individual utility functions.
🪛 GitHub Actions: Pull request workflow
test/scripts/dbManagement/resetDB.test.ts

[error] 1-1: Import statements could be sorted.


[error] 1-1: Formatter would have printed the following content: Code formatting issues detected.

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Analyse Code With CodeQL (typescript)
  • GitHub Check: Analyse Code With CodeQL (typescript)
🔇 Additional comments (12)
scripts/dbManagement/helpers.ts (9)

1-19: Great modular design and clarity in imports.

The file establishes essential utilities and demonstrates well-organized imports. Overall, this structure promotes maintainability and readability.


28-28: Potential mismatch for bucketName.

Currently, bucketName is set to envConfig.MINIO_ROOT_USER. This environment variable name typically implies a username credential (e.g., a user or an access key) rather than a bucket name. Consider confirming that this is indeed intended or rename the variable to avoid confusion.


70-93: Limited error reporting when formatting the database.

This function only returns false upon error, providing minimal context. If debugging proves difficult, consider rethrowing or logging a more descriptive error to help identify issues more quickly.


141-174: Check whether the bucket exists before removing objects.

If the bucket name is misconfigured or the bucket does not exist, listObjects can fail unexpectedly. Logging “bucket not found” or verifying existence can help diagnose configuration errors at an earlier stage.


179-209: Ensure more robust JSON parse error handling.

If any file is malformed, the function provides a generic error message. You may consider catching the parsing error separately and identifying which file is invalid for easier troubleshooting.


215-222: Return diagnostic info in pingDB.

Currently, the function only tries a simple query. Providing more descriptive logs or returning additional diagnostic information could simplify debugging if a database connection fails.


228-250: Log partial insert details in checkAndInsertData.

Since duplicates are silently ignored via onConflictDoNothing, you could log how many rows were inserted versus ignored. This extra transparency can assist in monitoring data updates.


258-524: Reduce duplication in insertCollections.

Repeated logic for reading JSON, parsing dates, and inserting data is present for each collection. Centralizing this in a shared helper or a dynamic mapping approach can reduce code repetition and enhance maintainability.


540-584: Refine error handling in checkDataSize.

If an error occurs, the function prints it but returns false. In critical production scenarios, consider throwing the error or providing optional partial results to help with troubleshooting.

test/scripts/dbManagement/addSampleData.test.ts (2)

10-50: Comprehensive test coverage for successful pathways.

This set of tests efficiently mocks critical helper functions and verifies the main flow, ensuring all steps (database ping, admin creation, sample data insertion) are executed and logged as expected. Excellent use of vitest spies to capture logs.


77-88: Valid error path testing logic.

The test ensures that when insertCollections fails, the function surfaces an appropriate error message. This robust error handling check helps maintain reliability of the data import process.

test/scripts/dbManagement/resetDB.test.ts (1)

6-106: Thorough test coverage for destructive reset operations.

These tests successfully capture user confirmation flow, database connectivity, destructive formatting, bucket cleanup, and administrator creation. The suite comprehensively verifies both failure scenarios and success paths without extraneous complexity.

@palisadoes
Copy link
Contributor

Please merge with the latest upstream

@JaiPannu-IITI
Copy link
Author

Sure sir,
Also I am still working on code,
I'll tag you once it's done

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a848757 and b3d9160.

📒 Files selected for processing (1)
  • test/scripts/dbManagement/resetDB.test.ts (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Run tests for talawa api
  • GitHub Check: Analyse Code With CodeQL (typescript)
🔇 Additional comments (5)
test/scripts/dbManagement/resetDB.test.ts (5)

1-4: Fix import ordering.

The import statements need to be reordered to comply with the pipeline requirements.

Apply this diff to fix the import order:

-import * as helpers from "scripts/dbManagement/helpers";
-import * as resetDataModule from "scripts/dbManagement/resetData";
-// tests/dbManagement/resetData.test.ts
-import { beforeEach, expect, suite, test, vi } from "vitest";
+import { beforeEach, expect, suite, test, vi } from "vitest";
+import * as helpers from "scripts/dbManagement/helpers";
+import * as resetDataModule from "scripts/dbManagement/resetData";

6-10: LGTM! Good test suite setup.

The test suite follows best practices by resetting all mocks before each test using beforeEach.


12-22: LGTM! Well-structured test case.

The test follows the AAA pattern and effectively validates the cancellation scenario.


24-33: LGTM! Good error handling test.

The test effectively validates the error handling for database connection failures.


89-116: LGTM! Comprehensive success scenario test.

The test effectively validates all success messages when operations complete successfully.

@JaiPannu-IITI
Copy link
Author

JaiPannu-IITI commented Feb 23, 2025

@palisadoes PR is ready for review.

  1. Scripts are ready
  2. Changed sample data images for minio bucket from old stock images to DALL.E generated to avoid copyright claims against talawa.
  3. Upgraded reset:db to reset:data which formats database and buckets both to clear of unwanted graphical content from demo at ease.
  4. Tests for main files ( addSampleData.ts and resetData.ts ) has been added.

Test for helper.ts (remaining 28% patch) will be added once we come up with decision regarding pg-mem since these interactive tests can't be performed with others right now as they are destructive in nature.

Please don't close issue, I will raise follow-up for this PR with more test coverage with pg-mem installation and integration (if agreed) along with Docker shell solution to run these scripts in one line CLI.

Meanwhile this can be merged to avoid conflicts like last time as this have some folder rearrangements but is ready for use and tested.

@palisadoes palisadoes merged commit 0aa74c2 into PalisadoesFoundation:develop-postgres Feb 23, 2025
14 of 17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants