Hunting for data on the internet is a challenge in itself, and once you've found that elusive data, wrangling it into something usable can prove difficult, frustrating and incredibly time-consuming. It doesn't have to be this way, thanks to web scrapers.
A web scraper (or data scraper) is a tool that automatically scours a web page for data, plucks it out, formats it into something usable, and saves the output to a file. Basically, you tell the scraper where to go, and leave the rest to the tool.
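For the curious, here is a rough sketch of what those steps (find the data, pluck it out, format it, save it) can look like in code. It uses only Python's standard library and is not how any particular scraping tool works internally. The `TableExtractor` class, the `scrape_table` and `rows_to_csv` helpers, and the sample page with its made-up figures are all invented for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being built, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_table(html):
    """Pluck the table rows out of an HTML string."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Format the rows as CSV text, ready to be saved to a file."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical page content with made-up numbers, standing in for a real page.
SAMPLE_HTML = """
<table>
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Exampleville</td><td>1200</td></tr>
  <tr><td>Sampletown</td><td>850</td></tr>
</table>
"""

rows = scrape_table(SAMPLE_HTML)
csv_text = rows_to_csv(rows)
print(csv_text)
```

To try this on a live page, you could download the HTML first with `urllib.request.urlopen(url).read().decode()` and pass the result to `scrape_table`. Real pages are usually messier than this sample, which is exactly the work a visual scraper takes off your hands.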
Data journalists and data visualization practitioners use scraping frequently. There is one tool we use often and, since we limit our scraping to fewer than a thousand runs a month, it's completely free. It's revolutionized our interactions with Wikipedia!
The tool we use is Import.io. This scraper has a nice visual interface, so there is no need to code anything or write complex syntax. You simply enter the URL of the page to be scraped, and let the tool process the page. After processing, you confirm or edit what it found, or manually tell the scraper which information to gather. I've found that on more complex datasets, it's worth manually pointing Import.io to the actual data, a process that takes tens of seconds. It's so easy. You can even schedule daily runs of your scrape.
If you trawl the web for numbers, tables, and stuff to build spreadsheets and charts out of, using a scraper is going to make your life so much easier. Trust us.