Web

PaaS - Platform as a Service

See the PaaS doc.

SSG - Static Site Generators

Generate static HTML pages from code or markdown. Jekyll is an obvious example, used with GitHub Pages.
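
To make the idea concrete, here is a toy sketch of what an SSG does at its core: take Markdown source and a page layout, and emit a static HTML string. It is an illustration only and not how Jekyll or Hugo are implemented. The function names (`markdownToHtml`, `renderPage`) and the layout are hypothetical, and only headings and plain paragraphs are handled.

```javascript
// Toy static site generator: converts a tiny Markdown subset to HTML.
// Illustration only; real SSGs use full Markdown parsers and templating
// engines, and write the result to files in an output directory.

// Escape characters that are special in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Convert "# Heading" lines to <h1>..<h6> and other non-blank lines to <p>.
function markdownToHtml(markdown) {
  return markdown
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const heading = line.match(/^(#{1,6})\s+(.*)$/);
      if (heading) {
        const level = heading[1].length;
        return `<h${level}>${escapeHtml(heading[2])}</h${level}>`;
      }
      return `<p>${escapeHtml(line)}</p>`;
    })
    .join("\n");
}

// Wrap the rendered body in a fixed page layout (hypothetical template).
function renderPage(title, markdown) {
  return [
    "<!DOCTYPE html>",
    "<html>",
    `<head><title>${escapeHtml(title)}</title></head>`,
    "<body>",
    markdownToHtml(markdown),
    "</body>",
    "</html>",
  ].join("\n");
}

const page = renderPage("Web", "# Web\n\nNotes on SSGs & scrapers.");
console.log(page);
```

Because the output is plain HTML, it can be served by any static file host, which is why SSG output pairs well with services such as GitHub Pages.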

List of SSGs:

Jekyll

https://jekyllrb.com/

https://github.com/jekyll/jekyll

GitHub Pages has native support for Jekyll.

Written in Ruby.

See this repo: HariSekhon/CI-CD

And this resulting GitHub Page: https://harisekhon.github.io/CI-CD/

Hugo

https://gohugo.io/

https://github.com/gohugoio/hugo

Written in Go.

Builds faster than Jekyll and is simpler to install, since it ships as a single binary.

Netlify

Integrates with a GitHub repo to build and deploy a Jekyll site on every push (CI/CD).

https://harisekhon.netlify.app/

Web Scrapers

  • Flyscrape - standalone scraping tool configured in JavaScript
  • Scrapy - Python web scraping library

Flyscrape

https://flyscrape.com

https://github.com/philippta/flyscrape

"Doesn't require advanced programming skills"

  • but it does require some basic JavaScript programming to fill in a config.js file describing what to extract and return
    • jQuery- or cheerio-like API for selecting HTML elements
  • can access cookie stores from browsers
  • browser / JavaScript rendering for complex websites
    • can launch a Chromium browser to render the page and then scrape the resulting HTML
  • outputs JSON for further processing

Install

On Mac:

brew install flyscrape

or

curl -fsSL https://flyscrape.com/install | bash

Config

Create a new config:

flyscrape new flyscrape.config.js

or use a ready-to-run example from HariSekhon/Templates:

wget -cO flyscrape.config.js https://raw.githubusercontent.com/HariSekhon/Templates/refs/heads/master/flyscrape.config.js
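
For orientation, a config might look roughly like the sketch below. This is not the contents of the HariSekhon/Templates file. The target URL is a placeholder, and the option names (`url`, `browser`, `cookies`) and the `doc.find()` / `.text()` / `.attr()` calls reflect my understanding of the Flyscrape documentation, so check them against the current docs before relying on them.

```javascript
// flyscrape.config.js - illustrative sketch only.

export const config = {
  // Placeholder starting URL (assumption, replace with the real target).
  url: "https://example.com/",

  // Render pages in a Chromium browser before scraping,
  // for sites that build their HTML with JavaScript.
  browser: true,

  // Reuse the cookie store of a locally installed browser.
  cookies: "chrome",
};

// Called for each fetched page; the returned object is emitted as JSON.
export default function ({ doc }) {
  // jQuery-like selection of HTML elements.
  const links = doc.find("a");

  return {
    title: doc.find("title").text(),
    links: links.map((link) => ({
      text: link.text(),
      href: link.attr("href"),
    })),
  };
}
```

The file is an ES module that Flyscrape loads itself, so it is not meant to be run directly with Node.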

Interactive Config Development

Run this, then edit the file. The terminal updates live to show what is being extracted:

flyscrape dev flyscrape.config.js

Run

flyscrape run flyscrape.config.js
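
Since the results are JSON, they can be post-processed with a few lines of code. The sketch below assumes the output is a JSON array of records shaped like `{ url, data, timestamp }`, where `data` is whatever the config's function returned. That shape is an assumption, so adjust it if your output differs. The `summarise` function and the sample record are hypothetical.

```javascript
// Summarise scraper output: list which fields were extracted per URL.
// Assumption: the input is a JSON array of records with `url` and `data` keys.

function summarise(jsonText) {
  const records = JSON.parse(jsonText);
  return records.map((record) => ({
    url: record.url,
    // Tolerate records with no data object.
    fields: Object.keys(record.data || {}),
  }));
}

// Hypothetical sample, not real scraper output.
const sample = JSON.stringify([
  { url: "https://example.com/", data: { title: "Example Domain" } },
]);

console.log(summarise(sample));
```

In practice `jsonText` would come from wherever the run's output was saved, rather than from an inline sample.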

Diagrams

From the HariSekhon/Diagrams-as-Code repo:

Web Basics

AWS Web Traffic Classic

AWS Load Balanced Web Farm

AWS Clustered Web Services

Web MySQL Replica Architecture

Advanced Web Services Open Source

Cloudflare and Kubernetes Web Architecture

Multi-Datacenter Web Stack

Kubernetes Traefik Web Architecture

Kubernetes Kong Web Architecture

REST vs GraphQL