yorkulibraries/web-archiving-cron-scripts

YUL Web Archiving Cron Scripts

Description

This is a collection of shell scripts to capture, preserve, and replay York University and Government of Canada websites using Browsertrix Crawler and pywb.

Usage

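Add an entry to your crontab to run a crawl on a schedule. For example, the following entry runs the "yfile" crawl every Monday at 09:05 and discards all output: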

05 09 * * 1 bash -c 'yulWA --name "yfile" --crawl-config "/crawl-configs/yu-yfile.yaml" --crawl-dir "/browsertrix" --dedup-dir "/dedup" --import-dir "/import" --workers 8 --version 1.6.0 > /dev/null 2>&1'
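
When several crawls need entries, it may help to generate the crontab lines rather than write each one by hand. The following is a minimal sketch, not part of this repository: the function name `yulwa_cron_line`, its argument order, and the log path are assumptions made for illustration. It uses only the `yulWA` flags shown in the entry above, keeps their values fixed, and appends output to a log file instead of discarding it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): print one crontab
# entry for a yulWA crawl. Only flags from the example entry are used;
# the directory, worker, and version values are copied from it unchanged.
yulwa_cron_line() {
  local schedule="$1"  # five-field cron schedule, e.g. "05 09 * * 1"
  local name="$2"      # value passed to --name
  local config="$3"    # value passed to --crawl-config
  local log="$4"       # assumed log file path; output is appended here
  printf '%s bash -c '\''yulWA --name "%s" --crawl-config "%s" --crawl-dir "/browsertrix" --dedup-dir "/dedup" --import-dir "/import" --workers 8 --version 1.6.0 >> "%s" 2>&1'\''\n' \
    "$schedule" "$name" "$config" "$log"
}

# Same schedule and crawl as the entry above, with logging enabled.
yulwa_cron_line "05 09 * * 1" "yfile" "/crawl-configs/yu-yfile.yaml" "/var/log/yulWA-yfile.log"

# To install the generated line, it could be appended to the current crontab:
#   ( crontab -l 2>/dev/null; yulwa_cron_line ... ) | crontab -
```

The script only prints the entry, so the result can be reviewed before it is installed. Note that `%` has a special meaning in crontab commands, so any path passed to the helper should not contain it.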

License

Public Domain

CC0
