Scraping Data from College Bookstores in the Hunt for Cheaper Textbooks

Ben Greenberg and Rui Xia, the co-founders of Textyard, are moving on to other projects, but in doing so they've open-sourced the code that powers the Textyard website, asserting that "any college student with rudimentary coding skills will now be able to take on their local bookstore."

Textyard is a textbook comparison site that makes it easier for students to do a little research before buying their textbooks at their local campus bookstore. The service is meant to help counter the high price of textbooks by giving students more options and information about where to purchase their books. The website lets you enter all your classes and sections for the semester, pulls the list of required and recommended textbooks, and gives you their price on Amazon, Chegg, eBay and other online outlets.

How does Textyard know which textbooks are required? It "scrapes" the college bookstores' websites, meaning that it programmatically harvests the data about courses and textbooks. As Textyard's Greenberg notes, many colleges use one of six major online storefronts, so the team has written the code that can extract the information from each.
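To make "scraping" concrete, here is a minimal sketch in Python of what extracting a book list from a storefront page might look like. It is not Textyard's code: the HTML sample, the `data-isbn` and `data-status` attributes, the placeholder ISBNs, and the `BookListParser` and `extract_books` names are all assumptions made up for illustration. A real storefront would have its own markup and would need its own extraction rules, which is presumably why the team wrote a separate scraper for each storefront.

```python
from html.parser import HTMLParser

# Hypothetical markup with placeholder ISBNs. Real bookstore storefronts
# use their own, different HTML, so the rules below are illustrative only.
SAMPLE_PAGE = """
<ul id="course-materials">
  <li class="book" data-isbn="9780000000001" data-status="required">Intro to Biology</li>
  <li class="book" data-isbn="9780000000002" data-status="recommended">Lab Manual</li>
</ul>
"""


class BookListParser(HTMLParser):
    """Collects one dict per <li class="book"> element in the page."""

    def __init__(self):
        super().__init__()
        self.books = []
        self._current = None  # the book being read, or None outside a book

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "li" and attributes.get("class") == "book":
            self._current = {
                "isbn": attributes.get("data-isbn"),
                "status": attributes.get("data-status"),
                "title": "",
            }

    def handle_data(self, data):
        # Text inside a book element is treated as (part of) its title.
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "li" and self._current is not None:
            self._current["title"] = self._current["title"].strip()
            self.books.append(self._current)
            self._current = None


def extract_books(html):
    """Return the list of books found in one page of storefront HTML."""
    parser = BookListParser()
    parser.feed(html)
    parser.close()
    return parser.books


books = extract_books(SAMPLE_PAGE)
for book in books:
    print(book["status"], book["isbn"], book["title"])
```

In practice the page would be downloaded from the bookstore's site rather than embedded as a string, and the resulting ISBNs would then be looked up at other retailers to compare prices.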

Greenberg also argues that this isn't illegal, although in the past some web-scraping companies have been sued for "trespass to chattels" (most famously, perhaps, when eBay sought an injunction to stop Bidder's Edge from scraping its site to display auction information). In the case of Textyard, Greenberg argues that course and textbook information must be available to all students and bookstores under the Higher Education Opportunity Act. He also contends that if scraping does not keep a website from functioning (and typically it doesn't), there's really no way to say that the tool damages the bookstore's business.

You can read the rest of the story on Inside Higher Ed.

Photo credits: Roger Wollstadt


