[go: up one dir, main page]

Curl

Check if a static site is moved correctly

The lazy person’s guide to confirming that a move to a static site worked. Overview: Download all relevant URLs from Search Console Convert download to a URL list Check for http to https redirects Check for valid final URLs Download all relevant URLs I’m picking one approximate source of truth - the URLs that received impressions in Google Search. This list doesn’t need to be comprehensive, just something more than I’d manually pick.

Crawling all (most) of the web's robots.txt comments

Starting from this tweet … View tweet … I hacked together a few-lined robots.txt comment parser. I thought it was fun enough to drop here. Crawling the web for all robots.txt file comments curl -s -N http://s3.amazonaws.com/alexa-static/top-1m.csv.zip \ >top1m.zip && unzip -p top1m.zip >top1m.csv while read li; do d=$(echo $li|cut -f2 -d,) curl -Ls -m10 $d/robots.txt | grep "^#" | sed "s/^/$d: /" | tee -a allrobots.txt done < top1m.csv The first line (curl -s .

Using Curl to add rows to a Google Spreadsheet without using an API

Adding content to a Google Spreadsheet usually requires using the Spreadsheet API (archive.org), getting auth tokens, and tearing out 42 pieces of hair or more. If you just want to use Google Spreadsheets to log some information for you (append-only), a simple solution is to use a Google Form (archive.org) to submit the data. To do that, you just need to POST data using the field names, and you’re done. The data is stored in your spreadsheet, you even get a timestamp for free.