Web scraping is a method for extracting text from websites so that it can be analyzed. It is a form of content mining: you collect useful information from websites, such as quotes, prices, news, and company information. The data is gathered directly, either by reading a website's HTML code or through visual abstraction techniques, using the Python programming language.
Voted most interesting course in NYC
We start the workshop by exploring different methods of gathering data from the Web, then go through the whole process of gathering, storing, and analyzing data. Our examples use real-life financial quotes and 10-K annual reports. During the course we learn how to use numerous Python libraries: urllib, Requests, wget, BeautifulSoup 4, ssl, pdfminer3k, Twitter, and others.
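As a rough illustration of the gathering step, the sketch below pulls quote values out of an HTML fragment. It uses the standard library's html.parser rather than BeautifulSoup so that it runs without extra installs. The HTML fragment, the "ticker" and "quote" class names, and the QuoteParser and extract_quotes names are made up for this example and are not taken from the course materials.

```python
from html.parser import HTMLParser

# Hypothetical HTML fragment standing in for a downloaded page.
SAMPLE_HTML = """
<table>
  <tr><td class="ticker">ABC</td><td class="quote">101.25</td></tr>
  <tr><td class="ticker">XYZ</td><td class="quote">54.10</td></tr>
</table>
"""


class QuoteParser(HTMLParser):
    """Collects (ticker, quote) pairs from <td> cells by class name."""

    def __init__(self):
        super().__init__()
        self._field = None  # class of the <td> currently open, if any
        self._row = {}      # fields collected for the current <tr>
        self.rows = []      # finished (ticker, quote) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._field = dict(attrs).get("class")

    def handle_data(self, data):
        # Keep only text that sits inside a cell we care about.
        if self._field in ("ticker", "quote") and data.strip():
            self._row[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr":
            # Store the row only if both fields were found, then reset.
            if "ticker" in self._row and "quote" in self._row:
                self.rows.append((self._row["ticker"], float(self._row["quote"])))
            self._row = {}


def extract_quotes(html):
    """Parse an HTML string and return a list of (ticker, quote) tuples."""
    parser = QuoteParser()
    parser.feed(html)
    parser.close()
    return parser.rows


print(extract_quotes(SAMPLE_HTML))
```

With a real page, the HTML string would come from a download step (for example with urllib or Requests) instead of a hard-coded constant.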
We also learn to construct regular expression patterns to find targeted information on web pages. As part of content mining, we build a Twitter application to search for and analyze trends. To identify and tag parts of speech found in text, we use the Natural Language Toolkit (NLTK), a suite of Python libraries designed for natural language analysis.
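To give a feel for pattern matching on page text, here is a minimal sketch that finds dollar amounts with Python's re module. The sample sentence, the PRICE_PATTERN pattern, and the find_prices helper are assumptions made for illustration. The pattern only handles amounts with two decimals and comma thousands separators.

```python
import re

# Hypothetical page text; a real page would be downloaded first.
PAGE_TEXT = "ABC closed at $101.25, while XYZ closed at $1,054.10 on heavy volume."

# A dollar sign, 1-3 digits, optional ",ddd" groups, then exactly two decimals.
PRICE_PATTERN = re.compile(r"\$(\d{1,3}(?:,\d{3})*\.\d{2})")


def find_prices(text):
    """Return every matching dollar amount in the text as a float."""
    return [float(match.replace(",", "")) for match in PRICE_PATTERN.findall(text)]


print(find_prices(PAGE_TEXT))
```

Because the pattern has a single capturing group, findall returns only the numeric part of each match, without the dollar sign.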
You will learn:
- Pattern matching with regular expressions
- BeautifulSoup 4
- The urllib, Requests, and wget Python libraries
- Web Scraping and Web crawling
- Storing data in CSV files and an SQL database
- Building a Twitter app
- Chrome DevTools and HTML Tags
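The storage topic in the list above can be sketched roughly as follows, using only the standard library. The listing says "SQL database" without naming one, so SQLite is an assumption here. The rows, the quotes table, and the function names are also hypothetical. The CSV is written to an in-memory buffer and the database lives in memory so the example leaves no files behind.

```python
import csv
import io
import sqlite3

# Hypothetical scraped rows: (ticker, price).
ROWS = [("ABC", 101.25), ("XYZ", 54.10)]


def rows_to_csv(rows):
    """Write rows to CSV text with a header line and return it as a string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["ticker", "price"])
    writer.writerows(rows)
    return buffer.getvalue()


def rows_to_sqlite(rows):
    """Load rows into an in-memory SQLite table and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE quotes (ticker TEXT PRIMARY KEY, price REAL)")
    # Parameter placeholders keep scraped text from being run as SQL.
    conn.executemany("INSERT INTO quotes VALUES (?, ?)", rows)
    conn.commit()
    return conn


csv_text = rows_to_csv(ROWS)
conn = rows_to_sqlite(ROWS)
highest = conn.execute(
    "SELECT ticker, price FROM quotes ORDER BY price DESC LIMIT 1"
).fetchone()

print(csv_text)
print(highest)
```

To keep the data between runs, the buffer could be swapped for a file opened with newline="" and ":memory:" for a database file path.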
Prerequisites & Preparation:
- Python 3.5 or 3.6
- A text editor (Sublime Text or any other)