From b7104562f07b345450719a9cdcd07d99dd16586b Mon Sep 17 00:00:00 2001 From: eecavanna Date: Sun, 23 Feb 2025 22:32:22 -0800 Subject: [PATCH] Optimize migration notebook to only dump/restore relevant collections --- db/migrations/notebooks/.notebook.env.example | 4 +- .../notebooks/migrate_11_3_0_to_11_4_0.ipynb | 371 ++++++++---------- 2 files changed, 162 insertions(+), 213 deletions(-) diff --git a/db/migrations/notebooks/.notebook.env.example b/db/migrations/notebooks/.notebook.env.example index 91113f1c..c8fef2f6 100644 --- a/db/migrations/notebooks/.notebook.env.example +++ b/db/migrations/notebooks/.notebook.env.example @@ -1,4 +1,4 @@ -# Paths to folders in which the notebook will store Mongo dumps. +# Paths to existing folders in which the notebook will store Mongo dumps. PATH_TO_ORIGIN_MONGO_DUMP_FOLDER = "./mongodump.origin.out" PATH_TO_TRANSFORMER_MONGO_DUMP_FOLDER = "./mongodump.transformer.out" @@ -12,7 +12,7 @@ PATH_TO_MONGOSH_BINARY = "__REPLACE_ME__" # e.g. "/Users/Alice/Downloads/mongos # to migrate. It is where the "E" and "L" in "ETL" will take place. ORIGIN_MONGO_HOST="__REPLACE_ME__" ORIGIN_MONGO_PORT="__REPLACE_ME__" -ORIGIN_MONGO_USERNAME="__REPLACE_ME__" +ORIGIN_MONGO_USERNAME="__REPLACE_ME__" # e.g. "app.migrator" ORIGIN_MONGO_PASSWORD="__REPLACE_ME__" ORIGIN_MONGO_DATABASE_NAME="__REPLACE_ME__" # e.g. "nmdc" diff --git a/db/migrations/notebooks/migrate_11_3_0_to_11_4_0.ipynb b/db/migrations/notebooks/migrate_11_3_0_to_11_4_0.ipynb index 244b366c..4ad36ca8 100644 --- a/db/migrations/notebooks/migrate_11_3_0_to_11_4_0.ipynb +++ b/db/migrations/notebooks/migrate_11_3_0_to_11_4_0.ipynb @@ -1,52 +1,46 @@ { "cells": [ { + "metadata": {}, "cell_type": "markdown", - "id": "initial_id", - "metadata": { - "collapsed": true, - "jupyter": { - "outputs_hidden": true - } - }, - "source": "# Migrate MongoDB database from `nmdc-schema` `v11.3.0` to `v11.4.0`" + "source": "# Migrate MongoDB database from `nmdc-schema` `v11.3.0` to `v11.4.0`", + "id": "86266489462d10d5" }, { - "cell_type": "markdown", - "id": "3c31d85d", "metadata": {}, + "cell_type": "markdown", "source": [ "## Introduction\n", "\n", "This notebook will be used to migrate the database from `nmdc-schema` `v11.3.0` ([released](https://github.com/microbiomedata/nmdc-schema/releases/tag/v11.3.0) January 17, 2025) to `v11.4.0` ([released](https://github.com/microbiomedata/nmdc-schema/releases/tag/v11.4.0) February 12, 2025).\n", "\n", - "### Heads up\n", + "### Notice\n", + "\n", + "In each migration notebook between schema `v10.9.1` and `v11.3.0`, we dumped **all collections** from the Mongo database. We started doing that once migrations involved collection-level operations (i.e., creating, renaming, and deleting them), as opposed to only document-level operations.\n", "\n", - "Unlike some previous migrators, this one does not \"pick and choose\" which collections it will dump. There are two reasons for this: (1) migrators no longer have a dedicated `self.agenda` dictionary that indicates all the collections involved in the migration; and (2) migrators can now create, rename, and drop collections; none of which are things that the old `self.agenda`-based system was designed to handle. So, instead of picking and choosing collections, this migrator **dumps them all.**" - ] + "In _this_ migration notebook (from schema `v11.3.0` to `v11.4.0`), we dump only **one collection** from the Mongo database. 
We opted to do this after understanding the scope of the `Migrator` class ([here](https://github.com/microbiomedata/nmdc-schema/blob/main/nmdc_schema/migrators/migrator_from_11_3_0_to_11_4_0.py)) imported by this notebook. This eliminates some overhead from the migration process." + ], + "id": "81809c54b43ee383" }, { - "cell_type": "markdown", - "id": "f65ad4ab", "metadata": {}, - "source": [ - "## Prerequisites" - ] + "cell_type": "markdown", + "source": "## Prerequisites", + "id": "dffc8fac414e6b8a" }, { - "cell_type": "markdown", - "id": "17f351e8", "metadata": {}, + "cell_type": "markdown", "source": [ "### 1. Coordinate with stakeholders.\n", "\n", "We will be enacting full Runtime and Database downtime for this migration. Ensure stakeholders are aware of that." - ] + ], + "id": "f033049a3dd6d9d" }, { - "cell_type": "markdown", - "id": "233a35c3", "metadata": {}, + "cell_type": "markdown", "source": [ "### 2. Set up notebook environment.\n", "\n", @@ -54,37 +48,35 @@ "\n", "1. Start a **MongoDB server** on your local machine (and ensure it does **not** already contain a database having the name specified in the notebook configuration file).\n", " 1. You can start a [Docker](https://hub.docker.com/_/mongo)-based MongoDB server at `localhost:27055` by running the following command. A MongoDB server started this way will be accessible without a username or password.\n" - ] + ], + "id": "8be6e542187f9a62" }, { - "cell_type": "code", - "id": "8aee55e3", "metadata": {}, - "source": "!docker run --rm --detach --name mongo-migration-transformer -p 27055:27017 mongo:7.0.15", + "cell_type": "code", + "source": "!docker run --rm --detach --name mongo-migration-transformer -p 27055:27017 mongo:8.0.4", + "id": "14b2b2c65f62c309", "outputs": [], "execution_count": null }, { - "cell_type": "markdown", - "id": "6cd05ccb", "metadata": {}, + "cell_type": "markdown", "source": [ "2. Create and populate a **notebook configuration file** named `.notebook.env`.\n", " > You can use `.notebook.env.example` as a template." - ] + ], + "id": "d516ac8a1c15c660" }, { - "cell_type": "markdown", - "id": "69937b18", "metadata": {}, - "source": [ - "## Procedure" - ] + "cell_type": "markdown", + "source": "## Procedure", + "id": "48d25f7b40726502" }, { - "cell_type": "markdown", - "id": "fe81196a", "metadata": {}, + "cell_type": "markdown", "source": [ "### Install Python packages\n", "\n", @@ -98,30 +90,24 @@ "|---------------------------------------------------------------------------------|--------------------------------------------------------|\n", "| NMDC Schema PyPI package | https://pypi.org/project/nmdc-schema |\n", "| How to `pip install` from a Git branch
instead of PyPI | https://stackoverflow.com/a/20101940 |" - ] + ], + "id": "e025d76e8fcdc5bf" }, { + "metadata": {}, "cell_type": "code", - "id": "e25a0af308c3185b", - "metadata": { - "collapsed": false, - "jupyter": { - "outputs_hidden": false - }, - "scrolled": true - }, "source": [ "%pip install --upgrade pip\n", "%pip install -r requirements.txt\n", "%pip install nmdc-schema==11.4.0" ], + "id": "efcd532dd6184f23", "outputs": [], "execution_count": null }, { - "cell_type": "markdown", - "id": "a407c354", "metadata": {}, + "cell_type": "markdown", "source": [ "### Import Python dependencies\n", "\n", @@ -133,20 +119,20 @@ "|----------------------------------------|-------------------------------------------------------------------------------------------------------|\n", "| Dynamically importing a Python module | [`importlib.import_module`](https://docs.python.org/3/library/importlib.html#importlib.import_module) |\n", "| Confirming something is a Python class | [`inspect.isclass`](https://docs.python.org/3/library/inspect.html#inspect.isclass) |" - ] + ], + "id": "b52d1ff0e0d98472" }, { - "cell_type": "code", - "id": "9e8a3ceb", "metadata": {}, + "cell_type": "code", "source": "MIGRATOR_MODULE_NAME = \"migrator_from_11_3_0_to_11_4_0\"", + "id": "11c5d669a24f0ae6", "outputs": [], "execution_count": null }, { - "cell_type": "code", - "id": "dbecd561", "metadata": {}, + "cell_type": "code", "source": [ "# Standard library packages:\n", "import subprocess\n", @@ -171,23 +157,23 @@ "Migrator = getattr(migrator_module, \"Migrator\") # gets the class\n", "assert isclass(Migrator), \"Failed to import Migrator class.\"" ], + "id": "8d443a1bbc8e3701", "outputs": [], "execution_count": null }, { - "cell_type": "markdown", - "id": "99b20ff4", "metadata": {}, + "cell_type": "markdown", "source": [ "### Parse configuration files\n", "\n", "Parse the notebook and Mongo configuration files." - ] + ], + "id": "b496f19c849e733d" }, { - "cell_type": "code", - "id": "1eac645a", "metadata": {}, + "cell_type": "code", "source": [ "cfg = Config()\n", "\n", @@ -215,33 +201,33 @@ "!{mongorestore} --version\n", "!{mongosh} --version" ], + "id": "7ba2424242fb74e4", "outputs": [], "execution_count": null }, { - "cell_type": "markdown", - "id": "68245d2b", "metadata": {}, + "cell_type": "markdown", "source": [ "### Create MongoDB clients\n", "\n", "Create MongoDB clients you can use to access the \"origin\" and \"transformer\" MongoDB servers." 
- ] + ], + "id": "6ab8f0802ebed34" }, { - "cell_type": "code", - "id": "8e95f559", "metadata": {}, + "cell_type": "code", "source": [ "# Mongo client for \"origin\" MongoDB server.\n", - "origin_mongo_client = pymongo.MongoClient(host=cfg.origin_mongo_host, \n", + "origin_mongo_client = pymongo.MongoClient(host=cfg.origin_mongo_host,\n", " port=int(cfg.origin_mongo_port),\n", " username=cfg.origin_mongo_username,\n", " password=cfg.origin_mongo_password,\n", " directConnection=True)\n", "\n", "# Mongo client for \"transformer\" MongoDB server.\n", - "transformer_mongo_client = pymongo.MongoClient(host=cfg.transformer_mongo_host, \n", + "transformer_mongo_client = pymongo.MongoClient(host=cfg.transformer_mongo_host,\n", " port=int(cfg.transformer_mongo_port),\n", " username=cfg.transformer_mongo_username,\n", " password=cfg.transformer_mongo_password,\n", @@ -261,13 +247,13 @@ " # Sanity test: Ensure the transformation database does not exist.\n", " assert cfg.transformer_mongo_database_name not in transformer_mongo_client.list_database_names(), \"Transformation database already exists.\"" ], + "id": "a883186068ed590d", "outputs": [], "execution_count": null }, { - "cell_type": "markdown", - "id": "1e195db1", "metadata": {}, + "cell_type": "markdown", "source": [ "Delete the transformer database from the transformer MongoDB server if that database already exists there (e.g. if it was left over from an experiment).\n", "\n", @@ -277,12 +263,12 @@ "|------------------------------|---------------------------------------------------------------|\n", "| Python's `subprocess` module | https://docs.python.org/3/library/subprocess.html |\n", "| `mongosh` CLI options | https://www.mongodb.com/docs/mongodb-shell/reference/options/ |" - ] + ], + "id": "846e97bb0b6544ae" }, { - "cell_type": "code", - "id": "8939a2ed", "metadata": {}, + "cell_type": "code", "source": [ "# Note: I run this command via Python's `subprocess` module instead of via an IPython magic `!` command\n", "# because I expect to eventually use regular Python scripts—not Python notebooks—for migrations.\n", @@ -295,18 +281,13 @@ "completed_process = subprocess.run(shell_command, shell=True)\n", "print(f\"\\nReturn code: {completed_process.returncode}\")" ], + "id": "331093f3ce6c50a8", "outputs": [], "execution_count": null }, { + "metadata": {}, "cell_type": "markdown", - "id": "bc387abc62686091", - "metadata": { - "collapsed": false, - "jupyter": { - "outputs_hidden": false - } - }, "source": [ "### Create validator\n", "\n", @@ -318,17 +299,12 @@ "|------------------------------|------------------------------------------------------------------------------|\n", "| LinkML's `Validator` class | https://linkml.io/linkml/code/validator.html#linkml.validator.Validator |\n", "| Validating data using LinkML | https://linkml.io/linkml/data/validating-data.html#validation-in-python-code |" - ] + ], + "id": "8839f276a4402775" }, { + "metadata": {}, "cell_type": "code", - "id": "5c982eb0c04e606d", - "metadata": { - "collapsed": false, - "jupyter": { - "outputs_hidden": false - } - }, "source": [ "schema_definition = get_nmdc_schema_definition()\n", "validator = Validator(\n", @@ -339,42 +315,42 @@ "# Perform a sanity test of the validator.\n", "assert callable(validator.validate), \"Failed to instantiate a validator\"" ], + "id": "cd01c9f52db7f6e1", "outputs": [], "execution_count": null }, { - "cell_type": "markdown", - "id": "e7e8befb362a1670", "metadata": {}, + "cell_type": "markdown", "source": [ "### Create SchemaView\n", "\n", - "In this 
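Optionally, once the clients created in the next cell exist, a pre-migration baseline can be recorded for the one collection this migration touches. This is a minimal sketch and not part of the original procedure; it assumes the `origin_mongo_client` defined in the next cell, the `cfg` object parsed earlier, and the `workflow_execution_set` collection named later in this notebook (`baseline_count` is a name introduced here purely for illustration).

```python
# Optional baseline (not part of the original procedure): count the documents currently in
# the one collection this migration touches, so the figure can be compared after the restore.
baseline_count = origin_mongo_client[cfg.origin_mongo_database_name]["workflow_execution_set"].count_documents({})
print(f"workflow_execution_set documents before migration: {baseline_count}")
```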
step, you'll instantiate a `SchemaView` that is bound to the destination schema. \n", + "In this step, you'll instantiate a `SchemaView` that is bound to the destination schema.\n", "\n", "#### References\n", "\n", "| Description | Link |\n", "|-----------------------------|-----------------------------------------------------|\n", "| LinkML's `SchemaView` class | https://linkml.io/linkml/developers/schemaview.html |" - ] + ], + "id": "7d45f8a5d3aa9f3" }, { - "cell_type": "code", - "id": "625a6e7df5016677", "metadata": {}, + "cell_type": "code", "source": [ "schema_view = SchemaView(get_nmdc_schema_definition())\n", "\n", "# As a sanity test, confirm we can use the `SchemaView` instance to access a schema class.\n", "schema_view.get_class(class_name=\"Database\")[\"name\"]" ], + "id": "1778be5cd7b68ad2", "outputs": [], "execution_count": null }, { - "cell_type": "markdown", - "id": "3975ac24", "metadata": {}, + "cell_type": "markdown", "source": [ "### Revoke access from the \"origin\" MongoDB server\n", "\n", @@ -391,12 +367,12 @@ "| Description | Link |\n", "|--------------------------------|-----------------------------------------------------------|\n", "| Running a script via `mongosh` | https://www.mongodb.com/docs/mongodb-shell/write-scripts/ |" - ] + ], + "id": "2413c292652103a2" }, { - "cell_type": "code", - "id": "f761caad", "metadata": {}, + "cell_type": "code", "source": [ "shell_command = f\"\"\"\n", " {cfg.mongosh_path} {origin_mongo_cli_base_options} \\\n", @@ -406,82 +382,77 @@ "completed_process = subprocess.run(shell_command, shell=True)\n", "print(f\"\\nReturn code: {completed_process.returncode}\")" ], + "id": "7eef0f264af138f", "outputs": [], "execution_count": null }, { - "cell_type": "markdown", - "id": "7f9c87de6fb8530c", "metadata": {}, + "cell_type": "markdown", "source": [ "### Delete obsolete dumps from previous migrations\n", "\n", "Delete any existing dumps before we create new ones in this notebook. This is so the dumps you generate with this notebook do not get merged with any unrelated ones." - ] + ], + "id": "1295c7a43cfbd083" }, { - "cell_type": "code", - "id": "6a949d0fcb4b6fa0", "metadata": {}, + "cell_type": "code", "source": [ "!rm -rf {cfg.origin_dump_folder_path}\n", "!rm -rf {cfg.transformer_dump_folder_path}" ], + "id": "d5d721e946414514", "outputs": [], "execution_count": null }, { - "cell_type": "markdown", - "id": "b7799910b6b0715d", "metadata": {}, + "cell_type": "markdown", "source": [ - "### Dump collections from the \"origin\" MongoDB server\n", + "### Dump collection(s) from the \"origin\" MongoDB server\n", "\n", - "Use `mongodump` to dump all the collections **from** the \"origin\" MongoDB server **into** a local directory.\n", - "\n", - "- TODO: Consider only dumping collections represented by the initial schema." 
- ] + "Use `mongodump` to dump specific collection(s) **from** the \"origin\" MongoDB server **into** a local directory.\n" + ], + "id": "cd224e2f5d6ccd5" }, { + "metadata": {}, "cell_type": "code", - "id": "da530d6754c4f6fe", - "metadata": { - "scrolled": true - }, "source": [ - "# Dump all collections from the \"origin\" database.\n", + "# Dump the specified collection from the \"origin\" database.\n", "shell_command = f\"\"\"\n", " {mongodump} {origin_mongo_cli_base_options} \\\n", " --db='{cfg.origin_mongo_database_name}' \\\n", " --out='{cfg.origin_dump_folder_path}' \\\n", - " --gzip\n", + " --gzip \\\n", + " --collection='workflow_execution_set'\n", "\"\"\"\n", "completed_process = subprocess.run(shell_command, shell=True)\n", "print(f\"\\nReturn code: {completed_process.returncode}\")" ], + "id": "725dcb33c1728521", "outputs": [], "execution_count": null }, { - "cell_type": "markdown", - "id": "932ebde8abdd70ec", "metadata": {}, + "cell_type": "markdown", "source": [ - "### Load the dumped collections into the \"transformer\" MongoDB server\n", + "### Load the dumped collection(s) into the \"transformer\" MongoDB server\n", "\n", - "Use `mongorestore` to load the dumped collections **from** the local directory **into** the \"transformer\" MongoDB server.\n", + "Use `mongorestore` to load the dumped collection(s) **from** the local directory **into** the \"transformer\" MongoDB server.\n", "\n", "References:\n", "- https://www.mongodb.com/docs/database-tools/mongorestore/#std-option-mongorestore\n", "- https://www.mongodb.com/docs/database-tools/mongorestore/mongorestore-examples/#copy-clone-a-database" - ] + ], + "id": "9a7865d1a9d31945" }, { + "metadata": {}, "cell_type": "code", - "id": "79bd888e82d52a93", - "metadata": { - "scrolled": true - }, "source": [ "# Restore the dumped collections to the \"transformer\" MongoDB server.\n", "shell_command = f\"\"\"\n", @@ -496,13 +467,13 @@ "completed_process = subprocess.run(shell_command, shell=True)\n", "print(f\"\\nReturn code: {completed_process.returncode}\")" ], + "id": "6c047360d46e09cb", "outputs": [], "execution_count": null }, { - "cell_type": "markdown", - "id": "c3e3c9c4", "metadata": {}, + "cell_type": "markdown", "source": [ "### Transform the collections within the \"transformer\" MongoDB server\n", "\n", @@ -511,14 +482,12 @@ "> Reminder: The database transformation functions are defined in the `nmdc-schema` Python package installed earlier.\n", "\n", "> Reminder: The \"origin\" database is **not** affected by this step." - ] + ], + "id": "1d3e516513c3b7a8" }, { + "metadata": {}, "cell_type": "code", - "id": "9c89c9dd3afe64e2", - "metadata": { - "scrolled": true - }, "source": [ "# Instantiate a MongoAdapter bound to the \"transformer\" database.\n", "adapter = MongoAdapter(\n", @@ -535,23 +504,23 @@ "# Execute the Migrator's `upgrade` method to perform the migration.\n", "migrator.upgrade()" ], + "id": "282941a1a07a94cd", "outputs": [], "execution_count": null }, { - "cell_type": "markdown", - "id": "4c090068", "metadata": {}, + "cell_type": "markdown", "source": [ "### Validate the transformed documents\n", "\n", "Now that we have transformed the database, validate each document in each collection in the \"transformer\" MongoDB server." 
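Because the collection name is hard-coded in the `mongodump` command below, an optional pre-flight check can catch a typo before the dump runs. This is a hedged sketch, not part of the original procedure; it assumes the `origin_mongo_client` and `cfg` objects created earlier in this notebook.

```python
# Optional pre-flight check: confirm the collection named in the mongodump command below
# exists on the "origin" server, so a misspelled name fails fast here instead of silently
# producing an empty dump.
collection_to_dump = "workflow_execution_set"  # must match the --collection value used below
existing_collections = origin_mongo_client[cfg.origin_mongo_database_name].list_collection_names()
assert collection_to_dump in existing_collections, f"Collection not found on origin: {collection_to_dump}"
```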
- ] + ], + "id": "673e10ac5be8f3d8" }, { - "cell_type": "code", - "id": "e1c50b9911e02e70", "metadata": {}, + "cell_type": "code", "source": [ "# Get the names of all collections.\n", "collection_names: List[str] = get_collection_names_from_schema(schema_view)\n", @@ -574,7 +543,7 @@ " # Calculate how often we'll display a tick mark (i.e. a sign of life).\n", " num_documents_per_tick = num_documents_in_collection * 0.10 # one tenth of the total\n", " num_documents_since_last_tick = 0\n", - " \n", + "\n", " for document in collection.find():\n", " # Validate the transformed document.\n", " #\n", @@ -600,28 +569,26 @@ " if num_documents_since_last_tick >= num_documents_per_tick:\n", " num_documents_since_last_tick = 0\n", " print(\".\", end=\"\") # no newline\n", - " \n", + "\n", " print(\"]\")" ], + "id": "b70d83efc93ba0e8", "outputs": [], "execution_count": null }, { - "cell_type": "markdown", - "id": "3edf77c7", "metadata": {}, + "cell_type": "markdown", "source": [ "### Dump the collections from the \"transformer\" MongoDB server\n", "\n", "Now that the collections have been transformed and validated, dump them **from** the \"transformer\" MongoDB server **into** a local directory." - ] + ], + "id": "992778323e5abf6a" }, { + "metadata": {}, "cell_type": "code", - "id": "db6e432d", - "metadata": { - "scrolled": true - }, "source": [ "# Dump the database from the \"transformer\" MongoDB server.\n", "shell_command = f\"\"\"\n", @@ -631,98 +598,91 @@ " --gzip\n", "\"\"\"\n", "completed_process = subprocess.run(shell_command, shell=True)\n", - "print(f\"\\nReturn code: {completed_process.returncode}\") " + "print(f\"\\nReturn code: {completed_process.returncode}\")" ], + "id": "e05899950263caa8", "outputs": [], "execution_count": null }, { + "metadata": {}, "cell_type": "markdown", - "id": "997fcb281d9d3222", - "metadata": { - "collapsed": false, - "jupyter": { - "outputs_hidden": false - } - }, "source": [ "### Create a bookkeeper\n", "\n", "Create a `Bookkeeper` that can be used to document migration events in the \"origin\" server." - ] + ], + "id": "333a58b11c25631" }, { - "cell_type": "code", - "id": "dbbe706d", "metadata": {}, - "source": [ - "bookkeeper = Bookkeeper(mongo_client=origin_mongo_client)" - ], + "cell_type": "code", + "source": "bookkeeper = Bookkeeper(mongo_client=origin_mongo_client)", + "id": "b35d302475b22ea9", "outputs": [], "execution_count": null }, { - "cell_type": "markdown", - "id": "1e0c8891", "metadata": {}, + "cell_type": "markdown", "source": [ "### Indicate — on the \"origin\" server — that the migration is underway\n", "\n", "Add an entry to the migration log collection to indicate that this migration has started." - ] + ], + "id": "2e0866a47939597d" }, { - "cell_type": "code", - "id": "ca49f61a", "metadata": {}, - "source": [ - "bookkeeper.record_migration_event(migrator=migrator, event=MigrationEvent.MIGRATION_STARTED)" - ], + "cell_type": "code", + "source": "bookkeeper.record_migration_event(migrator=migrator, event=MigrationEvent.MIGRATION_STARTED)", + "id": "b08824e7b1f59c46", "outputs": [], "execution_count": null }, { - "cell_type": "markdown", - "id": "9c253e6f", "metadata": {}, + "cell_type": "markdown", "source": [ - "### Drop the original collections from the \"origin\" MongoDB server\n", + "### Skipped: Drop the original collections from the \"origin\" MongoDB server\n", + "\n", + "Note: This step is necessary for migrations where collections are being renamed or deleted. 
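For readers unfamiliar with the LinkML `Validator`, here is a minimal, hedged illustration of the validation pattern the next cell applies to every document: wrap a collection's documents in a dict shaped like a schema `Database` instance and validate that instance. It assumes the `validator` instantiated earlier and uses an empty list as a trivially small instance.

```python
# Minimal illustration of the validation pattern used in the next cell.
# An empty `results` list on the returned report means no validation findings were reported.
report = validator.validate({"workflow_execution_set": []}, target_class="Database")
print(f"Validation findings: {len(report.results)}")
```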
(The `--drop` option of `mongorestore` would only drop collections that exist in the dump being restored, which would not include renamed or deleted collections.)\n",
 "\n",
 "In the case of _this_ migration, no collections are being renamed or deleted. So, we can skip this step. The `workflow_execution_set` collection that the migrator _did_ transform will still be dropped when we run `mongorestore` with the `--drop` option later in this notebook.\n"
 ],
 "id": "2982d126803c6558"
 },
 {
 "metadata": {},
 "cell_type": "code",
 "source": [
 "print(\"skipped\")\n",
 "\n",
 "# shell_command = f\"\"\"\n",
 "# {cfg.mongosh_path} {origin_mongo_cli_base_options} \\\n",
 "# --eval 'use {cfg.origin_mongo_database_name}' \\\n",
 "# --eval 'db.dropDatabase()'\n",
 "# \"\"\"\n",
 "# completed_process = subprocess.run(shell_command, shell=True)\n",
 "# print(f\"\\nReturn code: {completed_process.returncode}\")"
 ],
 "id": "1b5a6f4e30f3d6f1",
 "outputs": [],
 "execution_count": null
 },
 {
 "metadata": {},
 "cell_type": "markdown",
 "source": [
 "### Load the collections into the \"origin\" MongoDB server\n",
 "\n",
 "Load the transformed collections into the \"origin\" MongoDB server."
 ],
 "id": "8d2845a0322bfc8c"
 },
 {
 "metadata": {},
 "cell_type": "code",
 "source": [
 "# Load the transformed collections into the origin server, replacing any same-named ones that are there.\n",
 "shell_command = f\"\"\"\n",
 " {mongorestore} {transformer_mongo_cli_base_options} \\\n",
 " --drop \\\n",
 " --preserveUUID \\\n",
 " --dir='{cfg.transformer_dump_folder_path}/{cfg.transformer_mongo_database_name}' \\\n",
 " --nsFrom='{cfg.transformer_mongo_database_name}.*' \\\n",
 " --nsTo='{cfg.origin_mongo_database_name}.*' \\\n",
 " --gzip\n",
 "\"\"\"\n",
 "completed_process = subprocess.run(shell_command, shell=True)\n",
- "print(f\"\\nReturn code: {completed_process.returncode}\") "
+ "print(f\"\\nReturn code: {completed_process.returncode}\")"
 ],
 "id": "248c443030832cdf",
 "outputs": [],
 "execution_count": null
 },
 {
 "metadata": {},
 "cell_type": "markdown",
 "source": [
 "### Indicate that the migration is complete\n",
 "\n",
 "Add an entry to the migration log collection to indicate that this migration is complete."
 ],
 "id": "777204c62e37c908"
 },
 {
 "metadata": {},
 "cell_type": "code",
 "source": "bookkeeper.record_migration_event(migrator=migrator, event=MigrationEvent.MIGRATION_COMPLETED)",
 "id": "596aba5ac125cb65",
 "outputs": [],
 "execution_count": null
 },
 {
 "metadata": {},
 "cell_type": "markdown",
 "source": [
 "### Restore access to the \"origin\" MongoDB server\n",
 "\n",
 "This effectively undoes the access revocation that we did earlier." 
- ] + ], + "id": "a7cac4478a921c3d" }, { - "cell_type": "code", - "id": "9aab3c7e", "metadata": {}, + "cell_type": "code", "source": [ "shell_command = f\"\"\"\n", " {cfg.mongosh_path} {origin_mongo_cli_base_options} \\\n", @@ -794,6 +742,7 @@ "completed_process = subprocess.run(shell_command, shell=True)\n", "print(f\"\\nReturn code: {completed_process.returncode}\")" ], + "id": "db70dca5eb1e31e7", "outputs": [], "execution_count": null }
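Before restoring access, an optional spot check (not part of the original procedure) can confirm the restored collection on the "origin" server is populated; comparing it with the `baseline_count` from the earlier sketch, if that was run, gives a quick sanity signal. It assumes the `origin_mongo_client` and `cfg` objects created earlier.

```python
# Optional post-migration spot check: confirm the restored collection on the "origin" server
# contains documents again. Comparing against `baseline_count` (if recorded earlier) is a
# quick sanity signal, not a guarantee of correctness.
restored_count = origin_mongo_client[cfg.origin_mongo_database_name]["workflow_execution_set"].count_documents({})
print(f"workflow_execution_set documents after restore: {restored_count}")
```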