Scraping a web page title with wget on the terminal



During development, it is handy to pull the title tag or h1 tags out of a page when you want a quick look at what a link points to, or to check whether it returns an error page.

The following command scrapes the title from “yahoo.com” in a terminal using “wget”.

$ wget --quiet -O - yahoo.com | sed -n -e 's!.*<title>\(.*\)</title>.*!\1!p'
Yahoo
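
To see what the sed expression does without touching the network, the same filter can be tried on a literal line of HTML. This is only a minimal sketch: extract_title is a hypothetical name introduced for this illustration and is not part of the original one-liner. The -n option suppresses sed's default output, and the trailing p prints only lines where the substitution matched, so only the text captured between <title> and </title> is printed.

```shell
# extract_title: hypothetical helper (an assumption, not from the article).
# Reads HTML on stdin and prints the text between <title> and </title>
# for every input line that contains both tags.
extract_title() {
  sed -n -e 's!.*<title>\(.*\)</title>.*!\1!p'
}

# Offline check with inline HTML, so no network access is needed.
printf '<html><head><title>Example Domain</title></head></html>\n' | extract_title
```

Because sed works one line at a time, a title whose opening and closing tags sit on different lines will not match this pattern.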


Define a function

gettitle(){ wget --quiet -O - "$1" | sed -n -e 's!.*<title>\(.*\)</title>.*!\1!p' ;}
  • Usage
$ gettitle https://stackoverflow.com/questions/9312154/wget-page-title
shell - Wget page title - Stack Overflow

$ gettitle localhost
Apache2 Ubuntu Default Page: It works
  • Add to .bashrc
cat << 'EOT' >> ~/.bashrc
# gettitle URL: print the <title> of a web page
gettitle(){ wget --quiet -O - "$1" | sed -n -e 's!.*<title>\(.*\)</title>.*!\1!p' ;}
EOT
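
The introduction mentions h1 tags as well, and the same sed pattern can be parameterized by tag name. The following is a rough sketch under stated assumptions: extract_tag and gettag are hypothetical names that do not appear in the original, and like gettitle they only handle elements that open and close on the same line.

```shell
# extract_tag TAG: hypothetical helper (an assumption, not from the article).
# Reads HTML on stdin and prints the text of TAG for each line where the
# element opens and closes on that line. [^>]* allows attributes such as
# class="...". The delimiter is | instead of !, because ! inside double
# quotes can trigger history expansion in an interactive bash session.
extract_tag() {
  sed -n -e "s|.*<${1}[^>]*>\(.*\)</${1}>.*|\1|p"
}

# gettag URL TAG: hypothetical wrapper that fetches the page with wget
# (same options as gettitle) and extracts the given tag.
gettag() {
  wget --quiet -O - "$1" | extract_tag "$2"
}

# Offline check with inline HTML, so no network access is needed.
printf '<body><h1 class="top">Release notes</h1></body>\n' | extract_tag h1
```

With these definitions, gettag localhost h1 would be the h1 counterpart of gettitle localhost.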

References:
https://stackoverflow.com/questions/9312154/wget-page-title




