Requirements
- A terminal emulator running the Bourne Again Shell (Bash), e.g. GNOME Terminal
- W3.org’s html-xml-utils - These can be installed by typing apt install html-xml-utils in your terminal emulator
- The movie.sh and process_movies.sh files
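Before running the scripts, it may help to confirm that the required commands are available on your PATH. The following is a minimal sketch, not part of movie.sh or process_movies.sh; the helper name missing_tools is hypothetical, and the suggested install command assumes a Debian or Ubuntu system.

```shell
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not part of the original scripts.
missing_tools() {
  local tool
  for tool in "$@"; do
    # command -v exits non-zero when the command cannot be found
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing=$(missing_tools wget grep sed hxnormalize hxselect)
if [ -n "$missing" ]; then
  # $missing is left unquoted so the names are joined onto one line
  echo "Missing tools:" $missing
  echo "On Debian or Ubuntu, try: sudo apt install wget html-xml-utils"
else
  echo "All required tools are installed."
fi
```

If hxnormalize or hxselect is reported as missing, installing the html-xml-utils package should provide both.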
How to run the script
- Download both scripts from GitHub. Keep them in the same directory.
- Open the terminal emulator
- Change directory (cd) into the directory that contains the movie.sh shell script
- Make both scripts executable by typing chmod u+x movie.sh and chmod u+x process_movies.sh
- Execute the script
./movie.sh
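The steps above can be tried end to end without the real scripts. The sketch below uses a throwaway stand-in named demo.sh (hypothetical, created in a temporary directory) to show the change-directory, chmod and execute sequence; in practice, substitute movie.sh and process_movies.sh.

```shell
# Create a temporary directory holding a stand-in script.
# demo.sh is hypothetical and only exists for this illustration.
workdir=$(mktemp -d)
printf '%s\n' '#!/bin/sh' 'echo "script ran"' > "$workdir/demo.sh"

# Change into the directory that contains the script.
cd "$workdir" || exit 1

# A newly created file is normally not executable,
# so grant the owner (u) execute permission (+x).
chmod u+x demo.sh

# Run the script by its relative path and capture what it prints.
output=$(./demo.sh)
echo "$output"
```

Note that shell commands are case-sensitive, so chmod and cd must be typed in lowercase.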
Tools and techniques used
The following bash tools and techniques are used for text processing:
- Wget - Retrieves web pages or files via HTTP, HTTPS or FTP
- Grep - Searches files for lines that match a given pattern
- Sed - A stream editor used to perform basic text transformations, such as selecting text between two patterns.
- Hxnormalize - Part of the html-xml-utils package. Used for normalizing HTML.
- Hxselect - Part of the html-xml-utils package. Used for selecting the HTML elements that match a CSS selector.
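- IFS - The internal field separator, the shell variable that controls how Bash splits text into fields
- Redirecting
  - Pipes - Used to redirect the output of one command as input to another
  - File redirects - Used to write the output of a command to a file or read its input from a file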
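Several of the techniques listed above can be seen working together in a small offline sketch. It is an illustration under stated assumptions, not the logic of movie.sh: the sample HTML, the marker comments and the variable names are all made up, and wget, hxnormalize and hxselect are left out so that the example runs without a network connection or extra packages.

```shell
# Sample HTML standing in for a downloaded page (made-up data, not from IMDb).
page=$(mktemp)
cat > "$page" <<'EOF'
<html>
<body>
<!-- begin list -->
<li class="movie">Alpha|1999</li>
<li class="movie">Beta|2004</li>
<!-- end list -->
<p>footer</p>
</body>
</html>
EOF

# sed -n with an address range prints only the lines between two patterns.
# grep keeps the lines that match a pattern.
# The second sed strips the HTML tags.
# The commands are joined with pipes, and a file redirect saves the result.
rows=$(mktemp)
sed -n '/begin list/,/end list/p' "$page" \
  | grep 'class="movie"' \
  | sed 's/<[^>]*>//g' > "$rows"

# IFS: split each row on the | character while reading from the file.
summary=""
while IFS='|' read -r title year; do
  summary="${summary}${title} (${year}); "
done < "$rows"

echo "$summary"
```

Setting IFS only on the read command keeps the change local to that command, so the rest of the script keeps the default field splitting. On real pages, a pipeline such as hxnormalize -x followed by hxselect with a CSS selector would typically replace the grep and sed steps, since regular expressions are fragile against arbitrary HTML.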
Works Cited
https://www.ss64.com/bash
https://www.w3.org/Tools/HTML-XML-utils/
http://www.imdb.com/