How to run Newspaper in an Amazon Lambda function
How to run Newspaper (the Python 2.7 version) in an Amazon Lambda function:
sudo yum install gcc gcc-c++ libjpeg-devel zlib-devel libevent-devel libxml2-devel libxslt-devel libpng-devel
sudo yum install python27-devel python27-pip
virtualenv env
source env/bin/activate
sudo /usr/bin/easy_install lxml
pip install newspaper
nano env/local/lib/python2.7/site-packages/newspaper/settings.py
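In that file, change the DATA_DIRECTORY variable value to '/tmp/.newspaper_scraper'. A Lambda function's filesystem is read-only apart from /tmp, so Newspaper's working directory has to live there.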
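The same edit can be scripted instead of done by hand in nano. The following is a minimal sketch, not part of the original steps: the helper name patch_data_directory is hypothetical, and it assumes settings.py contains exactly one top-level line of the form DATA_DIRECTORY = ... (check your installed copy before relying on it).

```python
import re

# Target value taken from the step above.
NEW_VALUE = "/tmp/.newspaper_scraper"


def patch_data_directory(settings_text, new_value=NEW_VALUE):
    """Return settings_text with its DATA_DIRECTORY assignment rewritten.

    Assumption: the file has exactly one line starting with
    'DATA_DIRECTORY ='. Anything else raises ValueError so a silent
    no-op cannot slip into the bundle.
    """
    pattern = re.compile(r"^DATA_DIRECTORY\s*=.*$", re.MULTILINE)
    replacement = "DATA_DIRECTORY = %r" % new_value
    # A function is used as the replacement so the value is inserted literally.
    patched, count = pattern.subn(lambda match: replacement, settings_text)
    if count != 1:
        raise ValueError(
            "expected exactly one DATA_DIRECTORY assignment, found %d" % count
        )
    return patched
```

To apply it, read env/local/lib/python2.7/site-packages/newspaper/settings.py, pass its contents through patch_data_directory, and write the result back to the same path.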
zip -9 ~/bundle.zip lambda_function.py
cd $VIRTUAL_ENV/lib/python2.7/site-packages
zip -r9 ~/bundle.zip *
cd $VIRTUAL_ENV/lib64/python2.7/site-packages
zip -r9 ~/bundle.zip *
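The three zip commands above put lambda_function.py and the contents of both site-packages directories at the root of one archive. As an illustration of that layout, here is a rough Python equivalent using the standard zipfile module. The function name build_bundle is hypothetical, and the sketch differs from zip in two ways: it also includes hidden top-level files, and it keeps the first copy when the same relative path appears in both directories.

```python
import os
import zipfile


def build_bundle(handler_path, package_dirs, bundle_path):
    """Create a zip with the handler file and package contents at its root."""
    with zipfile.ZipFile(bundle_path, "w", zipfile.ZIP_DEFLATED) as bundle:
        handler_name = os.path.basename(handler_path)
        bundle.write(handler_path, handler_name)
        seen = set([handler_name])
        for package_dir in package_dirs:
            for root, _dirs, files in os.walk(package_dir):
                for name in files:
                    full_path = os.path.join(root, name)
                    # Store paths relative to site-packages so that modules
                    # sit at the archive root, where Lambda imports them from.
                    arcname = os.path.relpath(full_path, package_dir)
                    if arcname in seen:
                        continue  # keep the first copy of a duplicate path
                    seen.add(arcname)
                    bundle.write(full_path, arcname)
    return bundle_path
```

With the virtualenv from the earlier steps, package_dirs would be the lib/python2.7/site-packages and lib64/python2.7/site-packages directories under $VIRTUAL_ENV.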
Upload the bundle.zip file to your Lambda function and set the handler to lambda_function.lambda_handler.
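The handler string has the form module_name.function_name: the file lambda_function.py, then the function lambda_handler inside it. The sketch below illustrates that mapping with importlib. It is not Lambda's actual bootstrap code, and resolve_handler is a hypothetical name.

```python
import importlib


def resolve_handler(handler_string):
    """Split 'module.function' and return the function object it names."""
    module_name, _, function_name = handler_string.rpartition(".")
    if not module_name or not function_name:
        raise ValueError("handler must look like 'module.function'")
    module = importlib.import_module(module_name)
    return getattr(module, function_name)
```

Run from a directory that contains lambda_function.py, resolve_handler('lambda_function.lambda_handler') would return the handler function defined in that file.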
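Contents of lambda_function.py:
from newspaper import Article

def lambda_handler(event, context):
    # The invoking event is expected to carry the article URL under 'url'.
    url = event['url']
    article = Article(url)
    article.download()  # fetch the page HTML
    article.parse()     # extract the article text from the HTML
    return {
        'content': article.text
    }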
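Before uploading, it can help to check the event and return shapes locally. The sketch below does that without network access by registering a stand-in for the newspaper module. FakeArticle and the example URL are hypothetical, and the stand-in replaces any real newspaper module for the rest of the process, so this belongs in a throwaway script rather than in the bundle.

```python
import sys
import types


class FakeArticle(object):
    """Hypothetical stand-in exposing only the calls the handler uses."""

    def __init__(self, url):
        self.url = url
        self.text = ''

    def download(self):
        pass  # the real class fetches the page here

    def parse(self):
        self.text = 'stub text for ' + self.url


# Register the stand-in under the name the handler imports from.
fake_module = types.ModuleType('newspaper')
fake_module.Article = FakeArticle
sys.modules['newspaper'] = fake_module

from newspaper import Article  # resolves to FakeArticle via sys.modules


def lambda_handler(event, context):
    article = Article(event['url'])
    article.download()
    article.parse()
    return {'content': article.text}


# Same event shape the deployed function would receive; no context needed.
result = lambda_handler({'url': 'http://example.com/story'}, None)
```

Here result is a dict with a single 'content' key, which is the shape the deployed function returns to its caller.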