Web Scraping with Python: Collecting Data from the Modern Web

By Ryan Mitchell

List Price: $31.99
Price: $27.19

Availability: Usually ships in 24 hours
Ships from and sold by Amazon.com

51 new or used available from $18.30

Average customer review:
(46 customer reviews)

Product Description

Learn web scraping and crawling techniques to access unlimited data from any web source in any format. With this practical guide, you’ll learn how to use Python scripts and web APIs to gather and process data from thousands—or even millions—of web pages at once.

Ideal for programmers, security professionals, and web administrators familiar with Python, this book not only teaches basic web scraping mechanics, but also delves into more advanced topics, such as analyzing raw data or using scrapers for frontend website testing. Code samples are available to help you understand the concepts in practice.

  • Learn how to parse complicated HTML pages
  • Traverse multiple pages and sites
  • Get a general overview of APIs and how they work
  • Learn several methods for storing the data you scrape
  • Download, read, and extract data from documents
  • Use tools and techniques to clean badly formatted data
  • Read and write natural languages
  • Crawl through forms and logins
  • Understand how to scrape JavaScript
  • Learn image processing and text recognition
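As a rough illustration of the first topic in the list above (parsing HTML pages), the sketch below pulls link targets and link text out of a small document. It is not taken from the book: it uses only the Python standard library's html.parser module, and the names LinkCollector, extract_links, and SAMPLE_HTML, as well as the sample markup, are assumptions made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) tuples
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while inside a link that has an href
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, text) tuples found in the HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# Hypothetical sample page, invented for this sketch
SAMPLE_HTML = """
<html><body>
  <h1>Catalog</h1>
  <ul>
    <li><a href="/books/1">First <b>book</b></a></li>
    <li><a href="/books/2">Second book</a></li>
    <li><a>No target</a></li>
  </ul>
</body></html>
"""

if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_HTML):
        print(href, "->", text)
```

Third-party packages such as BeautifulSoup offer a more convenient interface for the same task; the standard-library parser is used here only so the sketch runs without extra installs.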

Product Details

  • Amazon Sales Rank: #60554 in Books
  • Published on: 2015-07-24
  • Released on: 2015-07-24
  • Original language: English
  • Number of items: 1
  • Dimensions: 9.19" h x .58" w x 7.00" l, 1.47 pounds
  • Binding: Paperback
  • 256 pages

Editorial Reviews

About the Author

Ryan Mitchell is a Software Engineer at LinkeDrive in Boston, where she develops their API and data analysis tools. She is a graduate of Olin College of Engineering and is a master's degree student at Harvard University School of Extension Studies. Prior to joining LinkeDrive, she was a Software Engineer working on web scraping and data analysis at Abine.

Customer Reviews

Most helpful customer reviews

42 of 43 people found the following review helpful.
5 stars: Highly recommended. I wish I had this book two years ago
By Gerard Cunningham
I really liked this book, for the following reasons:
1. It is a great introduction to web scraping. The reader is given confidence to use well-known Python packages such as BeautifulSoup and get useful results from scraping webpages in a very short time.
2. It shows where to go after learning the basics: the author describes the tools, techniques and frameworks to use for scraping dynamic websites, including code examples. This is the most challenging part of the book because it frequently involves combining tools, and the reader will have to get his/her hands dirty and learn by doing. This is reasonable since different websites present different challenges.
3. I liked the author's writing style. She favors simple explanations, identifies potential pitfalls and makes clear, technical recommendations based on her experience.

Highly recommended. I wish I had this book two years ago.

3 of 3 people found the following review helpful.
5 stars: Good book with some good tips
By Johnny_Got_It
Good book with some good tips. Fairly basic but does touch on some advanced scraping techniques briefly. Author does a good job and I would absolutely recommend this to others looking to learn more about extracting info from web pages.

0 of 0 people found the following review helpful.
5 stars: Interesting and useful
By Mike B.
Well-written book on a slightly obscure subject that does have some real uses. If you need to grab stuff from a website with a Python script, this is the book for you! If you need to test a web application, there's good information on how to go about that in here too.

