Web Scraping with Python Training Course
Web scraping is a technique for extracting data from websites and saving it to local files or databases.
This instructor-led, live training (online or onsite) is aimed at developers who wish to use Python to automate the process of crawling multiple websites to extract data for processing and analysis.
By the end of this training, participants will be able to:
- Install and configure Python and all relevant packages.
- Retrieve and parse data stored across various websites.
- Understand how websites function and how their HTML is structured.
- Construct spiders to crawl the web at scale.
- Use Selenium to crawl AJAX-driven web pages.
Format of the Course
- Interactive lecture and discussion.
- Extensive exercises and practice sessions.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- This course assumes prior knowledge of programming.
- To request customized training for this course, please contact us to arrange.
Course Outline
Introduction
Setting Up the Development Environment
Python Primer: Data Structures, Conditionals, File Handling, etc.
Python Packages for Web Scraping: Scrapy and BeautifulSoup
Understanding How a Website Works
Structure of HTML
Making a Web Request
Scraping an HTML Page
Working with XPath and CSS Selectors
Filtering Data Using Regular Expressions
Creating a Web Crawler
Crawling AJAX and JavaScript Pages with Selenium
Web Scraping Best Practices
Troubleshooting Common Issues
Summary and Conclusion
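As a rough illustration of two outline topics (scraping an HTML page and filtering data using regular expressions), the following is a minimal sketch rather than course material. It assumes only the Python standard library is available and swaps in `html.parser` for Scrapy and BeautifulSoup to stay dependency-free. The names `LinkCollector`, `extract_links`, `filter_links`, and `SAMPLE_HTML` are hypothetical and exist only for this example.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Parse an HTML string and return its links as (href, text) pairs."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def filter_links(links, pattern):
    """Keep only the links whose href matches the regular expression."""
    regex = re.compile(pattern)
    return [(href, text) for href, text in links if regex.search(href)]


# Hypothetical page content, inlined so the sketch needs no network access.
SAMPLE_HTML = """
<html><body>
  <a href="/reports/2023.pdf">Annual report 2023</a>
  <a href="/about">About us</a>
  <a href="/reports/2024.pdf">Annual report 2024</a>
</body></html>
"""

pdf_links = filter_links(extract_links(SAMPLE_HTML), r"\.pdf$")
for href, text in pdf_links:
    print(href, "->", text)
```

In practice the HTML would come from a web request rather than an inline string, and a dedicated parser such as BeautifulSoup would usually cope better with malformed markup.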
Requirements
- Programming experience, preferably in Python. If participants have programming experience in a language other than Python, the training can be extended to include additional introductory Python exercises.
Audience
- Developers
Runs with a minimum of 4 people. For 1-to-1 or private group training, request a quote.
Testimonials (1)
Many different examples and topics have been covered, from basic investigation to login management and dynamic page management.
Daniele Tagliaferro - Creditsafe Italia Srl
Course - Web Scraping with Python
Related Courses
Scaling Data Analysis with Python and Dask
14 Hours
Data Analysis with Python, Pandas and NumPy
14 Hours
This instructor-led, live training in the US (online or onsite) is aimed at intermediate-level Python developers and data analysts who wish to enhance their skills in data analysis and manipulation using Pandas and NumPy for government applications.
By the end of this training, participants will be able to:
- Set up a development environment that includes Python, Pandas, and NumPy.
- Create a data analysis application using Pandas and NumPy for government workflows.
- Perform advanced data wrangling, sorting, and filtering operations.
- Conduct aggregate operations and analyze time series data.
- Visualize data using Matplotlib and other visualization libraries.
- Debug and optimize their data analysis code to ensure compliance with public sector governance standards.
FARM (FastAPI, React, and MongoDB) Full Stack Development
14 Hours
Developing APIs with Python and FastAPI
14 Hours
Machine Learning with Python – 2 Days
14 Hours
The objective of this course is to provide a foundational proficiency in applying Machine Learning methods in practice. Through the use of the Python programming language and its various libraries, and based on numerous practical examples, this course teaches participants how to utilize the most essential components of Machine Learning, make informed data modeling decisions, interpret algorithm outputs, and validate results.
Our goal is to equip you with the skills necessary to confidently understand and use the core tools from the Machine Learning toolbox, while avoiding common pitfalls in Data Science applications. This course is designed to enhance your capabilities for government workflows, ensuring alignment with public sector governance and accountability standards.
Machine Learning with Python – 4 Days
28 Hours
The objective of this course is to enhance proficiency in applying Machine Learning methods in practical scenarios. Utilizing the Python programming language and its various libraries, and through a wide range of practical examples, this course instructs participants on how to effectively use key Machine Learning components, make informed data modeling decisions, interpret algorithm outputs, and validate results.
Our goal is to equip you with the skills necessary to confidently understand and utilize the essential tools from the Machine Learning toolbox, while avoiding common pitfalls in Data Science applications for government.
Accelerating Python Pandas Workflows with Modin
14 Hours
Python for Natural Language Generation (NLG)
21 Hours
In this instructor-led, live training in the US, participants will learn how to use Python to produce high-quality natural language text by building their own NLG system from scratch. Case studies relevant to public sector workflows and governance will be examined, and the concepts will be applied to hands-on lab projects for generating content tailored for government.
By the end of this training, participants will be able to:
- Utilize NLG to automatically generate content for various industries, including journalism, real estate, weather reporting, and sports, with a focus on applications for government.
- Select and organize source content, plan sentences, and prepare a system for the automatic generation of original content that aligns with public sector needs.
- Understand the NLG pipeline and apply the appropriate techniques at each stage to ensure compliance with governmental standards.
- Comprehend the architecture of a Natural Language Generation (NLG) system designed for government use.
- Implement the most suitable algorithms and models for analysis and ordering, ensuring they meet the requirements of public sector workflows.
- Pull data from publicly available data sources as well as curated databases to use as material for generated text that supports government operations.
- Replace manual and laborious writing processes with computer-generated, automated content creation that enhances efficiency and accountability in government tasks.
Advanced Machine Learning with Python
21 Hours
Python: Automate the Boring Stuff
14 Hours
This instructor-led, live training in the US is based on the popular book, "Automate the Boring Stuff with Python," by Al Sweigart. It is designed for beginners and covers essential Python programming concepts through practical, hands-on exercises and discussions. The focus is on learning to write code to significantly enhance productivity in office environments.
By the end of this training, participants will be able to program in Python and apply these new skills to government work, including:
- Automating tasks by writing simple Python scripts.
- Creating programs that perform text pattern recognition using "regular expressions."
- Generating and updating Excel spreadsheets programmatically.
- Parsing PDFs and Word documents.
- Crawling websites to extract information from online sources.
- Developing programs that send out email notifications.
- Utilizing Python's debugging tools to quickly resolve bugs.
- Programmatically controlling the mouse and keyboard to automate repetitive actions.
Python Programming for Finance
35 Hours
Python is a programming language that has gained significant popularity in the financial sector. Adopted by major investment banks and hedge funds, it is used to develop a wide array of financial applications, from core trading systems to risk management platforms.
In this instructor-led, live training, participants will learn how to use Python to develop practical applications for solving a variety of finance-related challenges.
By the end of this training, participants will be able to:
- Understand the fundamentals of the Python programming language
- Download, install, and maintain the best development tools for creating financial applications in Python
- Select and utilize the most appropriate Python packages and programming techniques to organize, visualize, and analyze financial data from various sources (CSV, Excel, databases, web, etc.)
- Build applications that address issues related to asset allocation, risk analysis, investment performance, and more
- Troubleshoot, integrate, deploy, and optimize a Python application
Audience
- Developers
- Analysts
- Quants
Format of the course
- Part lecture, part discussion, exercises, and extensive hands-on practice
Note
- This training is designed to provide solutions for some of the primary challenges faced by finance professionals. If you have a specific topic, tool, or technique that you would like to cover or expand upon, please contact us to arrange.
Govtra offers this course for government personnel and organizations looking to enhance their financial technology capabilities and align with public sector workflows, governance, and accountability standards.