jookyboi
10/3/2013 - 5:42 PM

Extract article content from web pages using the Pismo gem with Mechanize, following rel="next" links across paginated articles. From http://stackoverflow.com/questions/14283974/what-ruby-gem-provides-the-fu

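require 'mechanize'
require 'pismo'

# Current Mechanize releases use Mechanize.new; WWW::Mechanize is the old namespace.
agent = Mechanize.new
page = agent.get("http://www.awesomeblog.com/amazing-article")

# MyScraper is your own model; it needs a String #text attribute and #save!.
scraper = MyScraper.new(:text => Pismo::Document.new(page.uri.to_s).lede.to_s)

# Follow rel="next" links until a page has none, appending each page's lede.
while (next_link = page.links.find { |link| link.rel?('next') })
  page = next_link.click
  scraper.text << Pismo::Document.new(page.uri.to_s).lede.to_s
end

scraper.save!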
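
A paginated article can link back to an earlier page, in which case a loop that follows rel="next" never ends. The sketch below is one way to cap the walk. `collect_pages` and its `max_pages` parameter are hypothetical names, not part of Mechanize or Pismo, and the page objects are only assumed to respond to `links`, with each link responding to `rel?` and `click`, as Mechanize pages and links do.

```ruby
# Hypothetical helper: walk a chain of pages via rel="next" links and
# collect whatever the block returns for each page.
# max_pages is an assumed safety cap against circular "next" links.
def collect_pages(first_page, max_pages: 50)
  texts = []
  page = first_page
  seen = 0

  while page && seen < max_pages
    texts << yield(page)

    # Stop when the current page has no rel="next" link.
    next_link = page.links.find { |link| link.rel?('next') }
    page = next_link && next_link.click
    seen += 1
  end

  texts
end
```

With the agent from the snippet, a call might look like `collect_pages(agent.page) { |page| Pismo::Document.new(page.uri.to_s).lede.to_s }`, and the resulting array can be joined into the scraper's text.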