-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
External data #2795
External data #2795
Conversation
WalkthroughThis update changes several permissions and configuration settings across the API. The Changes
Sequence Diagram(s)sequenceDiagram
participant DF as Datasets Function
participant AH as getAiresLastUrl
participant CH as getCampaignsLastUrl
participant API as External API
DF->>AH: Call getAiresLastUrl(AiresUrl)
AH->>API: Fetch data from Aires URL
API-->>AH: Return response
AH->>DF: Return airesResponse (with error handling)
DF->>CH: Call getCampaignsLastUrl(CampaignsUrl)
CH->>API: Fetch data from Campaigns URL
API-->>CH: Return response
CH->>DF: Return campaignsResponse (with error handling)
sequenceDiagram
participant Client as Caller
participant Crypto as Crypto Module
Client->>Crypto: Call sha256sum(source)
Crypto->>Crypto: Invoke shaSum(source, "SHA-256")
Crypto-->>Client: Return hex encoded hash
Client->>Crypto: Call sha1sum(source)
Crypto->>Crypto: Invoke shaSum(source, "SHA-1")
Crypto-->>Client: Return hex encoded hash
Possibly related PRs
Suggested reviewers
Poem
Tip 🌐 Web search-backed reviews and chat
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
⏰ Context from checks skipped due to timeout of 90000ms (1)
🔇 Additional comments (1)
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
…urage into fix/external_data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🔭 Outside diff range comments (1)
api/src/lib/crypto/index.ts (1)
81-99
: Consider adding error handling for stream operations.The shaSum function should handle potential stream errors during digest operations.
export async function shaSum( source: string | ReadableStream<Uint8Array>, alg: stdCrypto.DigestAlgorithm = "SHA-256", ): Promise<string> { let stream; if (source instanceof ReadableStream) { stream = source; } else { if (!await exists(source)) { throw new Error(`File not found: ${source}`); } const file = await read(source); stream = file.readable; } - const hashBuffer = await stdCrypto.crypto.subtle.digest(alg, stream); - return encodeHex(hashBuffer); + try { + const hashBuffer = await stdCrypto.crypto.subtle.digest(alg, stream); + return encodeHex(hashBuffer); + } catch (error) { + throw new Error(`Failed to calculate ${alg} hash: ${error.message}`); + } }
🧹 Nitpick comments (1)
api/src/db/external_data/datasets/AiresCovoiturage.ts (1)
9-10
: Consider adding type annotations and documentation.The new static properties would benefit from TypeScript type annotations and JSDoc comments explaining their purpose.
- static sha256 = undefined; - static filename = undefined; + /** SHA-256 checksum of the dataset file */ + static sha256: string | undefined = undefined; + /** Name of the dataset file */ + static filename: string | undefined = undefined;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
api/deno.lock
is excluded by!**/*.lock
📒 Files selected for processing (11)
api/justfile
(1 hunks)api/src/db/cmd-source.ts
(1 hunks)api/src/db/external_data/datasets.ts
(2 hunks)api/src/db/external_data/datasets/AiresCovoiturage.ts
(1 hunks)api/src/db/external_data/datasets/IncentiveCampaigns.ts
(1 hunks)api/src/db/external_data/helpers.ts
(1 hunks)api/src/db/geo/buildMigrator.ts
(1 hunks)api/src/deps.ts
(1 hunks)api/src/lib/crypto/index.ts
(3 hunks)api/src/pdc/providers/storage/S3StorageProvider.ts
(0 hunks)api/src/pdc/services/dashboard/providers/TerritoriesRepositoryProvider.ts
(0 hunks)
💤 Files with no reviewable changes (2)
- api/src/pdc/providers/storage/S3StorageProvider.ts
- api/src/pdc/services/dashboard/providers/TerritoriesRepositoryProvider.ts
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Test
🔇 Additional comments (5)
api/src/db/cmd-source.ts (1)
13-13
: LGTM! Configuration simplified.The removal of the
datasets
property from the app configuration aligns with the PR objectives to fix external_data datasets configuration. The simplified configuration is now more focused.api/src/lib/crypto/index.ts (1)
73-79
: Add security warning for SHA-1 usage.While the implementation is correct, SHA-1 is cryptographically broken and should not be used for security purposes.
✅ Verification successful
🌐 Web query:
Is SHA-1 still considered secure for file integrity checks?
💡 Result:
SHA-1 is **no longer considered secure** for file integrity checks in most security-critical contexts due to demonstrated vulnerabilities, though limited non-security uses may still exist. Here's a detailed breakdown: ### Security Vulnerabilities - **Collision attacks**: Modern computing power enables attackers to create different files with identical SHA-1 hashes ([3][4][8]). For example, researchers generated two distinct PDFs with matching SHA-1 hashes in 2017 ([3][5]). - **Cost-effectiveness**: Renting cloud resources to execute these attacks costs as little as $75k–$120k ([5]), making them feasible for motivated adversaries. - **NIST deprecation**: The U.S. National Institute of Standards and Technology mandated phasing out SHA-1 by 2030, citing its inability to withstand modern threats ([1][2]). ### Limited Acceptable Uses 1. **Non-security contexts**: - Git and other version-control systems use SHA-1 purely for data consistency checks (not security), relying on additional safeguards ([3][6]). - HMAC-SHA-1 remains secure in some implementations where collision resistance isn’t required ([6][3]). 2. **Legacy systems**: - Older hardware/software without support for newer algorithms might still use SHA-1 temporarily ([6]). ### Recommendations | Algorithm | Hash Length | Security Status | Use Case Priority | |-----------|-------------|------------------|-------------------| | **SHA-256** | 256-bit | Secure | High (default choice) | | **SHA-3** | Variable | Secure | High (future-proof) | | **SHA-1** | 160-bit | Deprecated | Low (legacy only) | Organizations should prioritize migrating to **SHA-256** or **SHA-3** for file integrity checks, especially in environments handling sensitive data or requiring regulatory compliance ([4][8]). For non-critical internal systems still using SHA-1, monitor for collision risks and phase out usage before NIST’s 2030 deadline ([1][2]). Citations: - 1: https://www.nist.gov/news-events/news/2022/12/nist-retires-sha-1-cryptographic-algorithm - 2: https://www.securityweek.com/nist-retire-27-year-old-sha-1-cryptographic-algorithm/ - 3: https://en.wikipedia.org/wiki/Sha-1 - 4: https://www.securew2.com/blog/what-is-sha-encryption-sha-256-vs-sha-1 - 5: https://www.bitdefender.com/en-us/blog/hotforsecurity/migration-to-sha-2-inevitable-as-sha-1-is-broken - 6: https://www.assured.se/posts/what-the-sha1-retirement-means-for-you - 7: https://westoahu.hawaii.edu/cyber/cscc/best-practices/sha-1-and-md5-hashing/ - 8: https://www.wolfandco.com/resources/insights/the-end-of-sha-1-understanding-risks-and-what-is-next/ - 9: https://threatpicture.com/terms/sha1/ - 10: https://learn.microsoft.com/en-us/lifecycle/announcements/sha-1-signed-content-retired
Security Warning Needed for SHA-1 Usage
The implementation of
sha1sum
is functionally correct; however, as SHA-1 is widely recognized as cryptographically broken for security-critical applications, please add a security warning (or deprecation notice) near its declaration. This will help signal to users that while the function exists (perhaps for legacy support), SHA-1 should not be used for any security-sensitive file integrity checks.
- File:
api/src/lib/crypto/index.ts
- Location: Near the definition of
sha1sum
- Action: Add a comment or documentation note explaining that SHA-1 is deprecated for security purposes and recommend using more secure alternatives like SHA-256 or SHA-3 for critical usage.
api/src/db/external_data/datasets/IncentiveCampaigns.ts (1)
9-10
: Consider adding type annotations and documentation.The new static properties would benefit from TypeScript type annotations and JSDoc comments explaining their purpose, similar to AiresCovoiturage.ts.
- static sha256 = undefined; - static filename = undefined; + /** SHA-256 checksum of the dataset file */ + static sha256: string | undefined = undefined; + /** Name of the dataset file */ + static filename: string | undefined = undefined;api/justfile (1)
126-126
: Verify the removal of --allow-import permission.The change from --allow-import to --allow-run suggests the code no longer needs dynamic imports. Please verify this doesn't break any functionality.
✅ Verification successful
🏁 Script executed:
#!/bin/bash # Description: Check for dynamic imports in external data migration code # Test: Search for dynamic imports that might require --allow-import rg -g '*.ts' 'import\(' src/db/external_data/Length of output: 109
I'll run an additional search across the entire repository for dynamic import usage in JavaScript/TypeScript files to ensure that removing the
--allow-import
permission doesn't leave behind any code relying on dynamic imports.
🏁 Script executed:
#!/bin/bash echo "Searching for dynamic imports in TypeScript files:" rg -g "*.ts" "import\(" || true echo "-------------------------" echo "Searching for dynamic imports in JavaScript files:" rg -g "*.js" "import\(" || trueLength of output: 4809
Permission Change Verification: No functionality break detected
- A repository-wide search for dynamic import usages did not identify any reliance on the removed
--allow-import
permission within the external data migration context.- The dynamic
import(
occurrences found in the codebase pertain to unrelated functionality (e.g., lazy loading in routing and methods namedimport
) and do not require the explicit permission.- The change to use
--allow-run
in the migration command aligns with current needs and does not impact dynamic import behavior.api/src/deps.ts (1)
235-235
: LGTM! Good migration to JSR registry.The migration from Deno standard library to JSR registry for the crypto module is a positive change. JSR provides better versioning and security guarantees.
export async function getCampaignsLastUrl(url: string): Promise<string> { | ||
const response = await fetch(url, { method: "get" }); | ||
const res = await response.json(); | ||
if (!response.ok) { | ||
console.error(res.error.data); | ||
} | ||
const list = res ? res.resources : []; | ||
const fileUrl = list.length > 0 ? list[0].latest : ""; | ||
return fileUrl; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve error handling and type safety in getCampaignsLastUrl.
Similar issues exist in this function:
- Errors are logged but execution continues.
- Response data structure is not type-safe.
- Code structure is very similar to getAiresLastUrl.
Consider this improved implementation:
+interface CampaignResource {
+ latest: string;
+}
+
+interface CampaignsResponse {
+ resources: CampaignResource[];
+ error?: {
+ data: string;
+ };
+}
+
export async function getCampaignsLastUrl(url: string): Promise<string> {
const response = await fetch(url, { method: "get" });
- const res = await response.json();
+ const res = await response.json() as CampaignsResponse;
if (!response.ok) {
- console.error(res.error.data);
+ throw new Error(`Failed to fetch Campaigns data: ${res.error?.data ?? 'Unknown error'}`);
}
- const list = res ? res.resources : [];
+ const list = res.resources;
const fileUrl = list.length > 0 ? list[0].latest : "";
+ if (!fileUrl) {
+ throw new Error('No valid Campaigns URL found');
+ }
return fileUrl;
}
Additionally, consider extracting common fetch logic into a shared helper:
async function fetchExternalData<T>(url: string, errorPrefix: string): Promise<T> {
const response = await fetch(url, { method: "get" });
const res = await response.json();
if (!response.ok) {
throw new Error(`${errorPrefix}: ${res.error?.data ?? 'Unknown error'}`);
}
return res as T;
}
userConfig.app as any, | ||
) as unknown as AppConfigInterface; | ||
|
||
const appConfig = { ...defaultConfig.app, ...userConfig.app } as AppConfigInterface; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Verify impact of switching to shallow merge.
The change from deep merge to shallow merge could affect nested properties in the app configuration. Please ensure that:
- No nested properties are expected in the app configuration.
- All required properties are properly merged.
Run this script to check for nested properties in app config:
🏁 Script executed:
#!/bin/bash
# Description: Check for nested properties in app config
# Test: Search for object property assignments in app config
rg -A 5 'app: \{' ./api/src/db/geo/
Length of output: 1375
Impact of shallow merge verified: Potential issues with nested properties detected.
Our review shows that the default configuration (in api/src/db/geo/config.ts
) defines an app
object with several nested properties (e.g., noCleanup
, targetSchema
, datasets
, datastructures
, sevenZipBinPath
). With the change from deep merging to shallow merging in buildMigrator.ts
, any user configuration provided via userConfig.app
will completely replace the corresponding nested objects from defaultConfig.app
rather than individually merging nested keys.
Please ensure that:
- User configurations always supply the complete nested structure when overriding defaults.
- There’s no expectation for a partial merge of nested properties.
If partial overrides are needed, consider switching back to a deep merge strategy.
export { | ||
decodeBase64, | ||
encodeBase64, | ||
} from "https://deno.land/std@0.224.0/encoding/base64.ts"; | ||
encodeHex, | ||
} from "jsr:@std/encoding@1"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Verify the impact of removing decodeBase64
.
The addition of encodeHex
aligns with the PR objectives for implementing SHA functions. However, ensure that the removal of decodeBase64
doesn't break any existing functionality.
Run the following script to check for any remaining usage of decodeBase64
:
🏁 Script executed:
#!/bin/bash
# Description: Check for any remaining usage of decodeBase64 function
# Test: Search for decodeBase64 usage. Expect: No occurrences.
rg "decodeBase64"
Length of output: 242
Impact Assessment for decodeBase64
Removal:
The search reveals that decodeBase64
is still imported and used in api/src/lib/crypto/index.ts
(e.g., decoding PEM contents). Removing it without addressing these references would break existing functionality.
- Action Required:
- Retain the export of
decodeBase64
inapi/src/deps.ts
until the crypto module is updated. - Alternatively, update the crypto code to replace
decodeBase64
with a suitable alternative if removal is necessary.
- Retain the export of
Description
sha256sum
etsha1sum
external_data
qui chargeaient tous les datasets geoChecklist
/api/specs
Merge
Je squash la PR et vérifie que le message de commit utilise la convention d'Angular :
Voir la convention de message de commit...
Types de commit
Le scope (optionnel) précise le module ou le composant impacté par le commit.
Summary by CodeRabbit
New Features
Refactor
Chores