Week 7: Web Scraping and HTML Creation

This week’s emphasis has been on scraping websites and creating HTML with Python. For the individual lab, we had to build on a previous lab (Lab 14), in which we created a frequency dictionary of all the unique words in a modified version of Dr. Seuss’s Green Eggs and Ham.

This time, however, instead of merely printing out the words and their counts, we had to display each word in an HTML file, altering the size and color of the text based on how often the word occurs in the story. My first approach was to use the frequency count as the font size. However, many words had a frequency of 1 or 2, which made them quite hard to see since their font size was being set to 1 or 2 pixels. Therefore, I decided to add 10 to every font size, so that no word would be smaller than 10 pixels. I was also able to use the frequency to influence the greenness of each word’s font. To do this, I had to convert the frequencies (integers) to hexadecimal and ensure that each hexadecimal number was two digits long, so it could represent the green component of the RGB hex code. I wound up scaling the frequencies so that the most frequent word would get the maximum amount of greenness (255, or FF in hexadecimal).
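Here’s a rough sketch of how that sizing and coloring could look in Python. This isn’t my exact lab code: the function names (green_hex, word_span, build_html), the use of span tags with inline styles, and the sample sentence are all assumptions made for illustration, and red and blue are simply left at 00.

```python
from collections import Counter
from html import escape


def green_hex(count, max_count):
    """Scale a frequency to 0-255 and format it as two uppercase hex digits."""
    level = round(255 * count / max_count)
    return f"{level:02X}"  # 02X pads with a leading zero, e.g. 5 -> "05"


def word_span(word, count, max_count):
    """Wrap one word in a span whose size and greenness depend on its count."""
    size = count + 10  # add 10 so rare words are still readable
    color = f"#00{green_hex(count, max_count)}00"  # red and blue left at 00 (assumption)
    return f'<span style="font-size: {size}px; color: {color};">{escape(word)}</span>'


def build_html(text):
    """Build a small HTML page with one styled span per unique word.

    Assumes the text contains at least one word.
    """
    counts = Counter(text.lower().split())
    max_count = max(counts.values())
    spans = [word_span(w, c, max_count) for w, c in counts.items()]
    return "<html><body>\n" + "\n".join(spans) + "\n</body></html>"


if __name__ == "__main__":
    # Made-up sample text, not the modified story from the lab
    sample = "i do not like green eggs and ham i do not like them"
    print(build_html(sample))
```

The 02X format specifier is what keeps every green value two digits long, and dividing by the largest count is what maps the most frequent word to FF.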

 

We also worked on our final project! Here’s a screen cap:
