Concurrent XML Link Validator: Go tool for efficient HTTP link validation from XML files, optimized for SiteMap format.

NPZlatu/XMLLinksValidator


XMLLinksValidator

A tool written in Go that concurrently fetches and validates the HTTP links in XML files (currently sitemaps). It could be extended to validate the links in any XML file.

Version

  • Go 1.14.2

Features

  • Sitemap & Product Feed Validation Options
  • Checks each link for a 200 response and reports errors
  • Also flags links that are inactive or inaccessible even though their status code is 200
  • Generates an output CSV file with the columns: URL, StatusCode, Validity, ErrorMessage
  • Customizable concurrency level
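
The CSV layout listed above can be sketched with Go's standard encoding/csv package. This is only an illustration, not the code in txtchecker.go: the row type, the toCSV function, and the true/false spelling of the Validity column are all assumptions made for the example.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"strconv"
)

// row is a hypothetical record holding the four columns named in the README.
type row struct {
	URL        string
	StatusCode int
	Valid      bool
	ErrMsg     string
}

// toCSV renders a header line plus one line per row.
// Validity is written as "true"/"false", which is an assumption.
func toCSV(rows []row) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	header := []string{"URL", "StatusCode", "Validity", "ErrorMessage"}
	if err := w.Write(header); err != nil {
		return "", err
	}
	for _, r := range rows {
		rec := []string{r.URL, strconv.Itoa(r.StatusCode), strconv.FormatBool(r.Valid), r.ErrMsg}
		if err := w.Write(rec); err != nil {
			return "", err
		}
	}
	w.Flush()
	return buf.String(), w.Error()
}

func main() {
	out, err := toCSV([]row{
		{URL: "https://example.com/", StatusCode: 200, Valid: true},
		{URL: "https://example.com/missing", StatusCode: 404, Valid: false, ErrMsg: "not found"},
	})
	if err != nil {
		fmt.Println("csv error:", err)
		return
	}
	fmt.Print(out)
}
```

Using encoding/csv rather than manual string joins means that error messages containing commas or quotes are escaped correctly.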

Project Structure

  xml-links-validator
    ├── ...
    ├── dedupe                         # project for link validity test
    │   └── dedupe.go                  # extracts all the unique links from the XML file and writes them to a text file
    ├── txtchecker
    │   └── txtchecker.go              # validates all the unique links in feed.txt and writes the results to a CSV file
    └── Sitemap.xml                    # sample sitemap (replace this with your own sitemap XML)

Running the project

Go must be installed to run the project.

  • Replace Sitemap.xml with the sitemap that needs to be validated.
  • Run dedupe.go, which collects all the unique links into a single text file.
  • Run txtchecker.go, which validates those links and writes the results to a CSV file.
  • Note: the concurrency level can be changed by updating the workers value in txtchecker.go. The default is 8.

    cd dedupe && go run dedupe.go
    cd ..
    cd txtchecker && go run txtchecker.go

For example:

    cd dedupe && go run dedupe.go -xml Sitemap

    cd ..
    cd txtchecker && go run txtchecker.go
