Skip to content

A real-time product review scraper powered by Node.js, Puppeteer, and Gemini-1.5-flash, dynamically fetching reviews from e-commerce sites and streaming data via Server-Sent Events (SSE).

Notifications You must be signed in to change notification settings

neeel200/ReviewMiner-AI

Repository files navigation

AI-Powered Product Review Scraper

This project dynamically scrapes product reviews from any e-commerce site and streams the data back to the client in real-time using Server-Sent-Events (SSE). The scraper is powered by Node.js, Puppeteer, and the Gemini-1.5-flash model for intelligent review extraction.
Demo Ouput link :- https://drive.google.com/file/d/1OdlrUyjl0WiGewzCaXNfZ6XlnKC1YiNr/view?usp=sharing

Features:

  • Dynamic Review Scraping: Extracts reviews from any product page, adapting to various site structures.
  • Real-Time Streaming: Streams reviews back to the client page-by-page as they are scraped.
  • AI Integration: Leverages the Gemini 1.5-flash model for dynamic CSS identification and review extraction.

Tech Stack:

  • Backend: Node.js(with SSE), Puppeteer
  • AI Model: Gemini-1.5-flash

Install and RUN:
npm install && npx puppeteer browsers install chrome && npm run dev

API Details:
Endpoint (GET) https://reviewminer-ai.onrender.com/api/reviews?page={url}&fetchOnly=10

  • Description:- This api is used to get all reviews for the given product’s page. This api scrapes all reviews from the given page with the help of LLM (gemini) and returns the list of reviews back in the response.

  • Query_Params:-

page (Required) : This is the product’s page url.

fetchOnly (Optional) :This parameter states the number of reviews we want from the given product page url i.e if provided will fetch the first “n” reviews from the page.IF NOT provided then fetches all reviews.

WorkFlow diagram:- https://app.eraser.io/workspace/yNRP8IRFrbY2IP5nKQoB?origin=share

About

A real-time product review scraper powered by Node.js, Puppeteer, and Gemini-1.5-flash, dynamically fetching reviews from e-commerce sites and streaming data via Server-Sent Events (SSE).

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published