nisham22
1/4/2018 - 5:12 AM

Scrapy

To map the nginx config volume

--volume "$(pwd)"/nginx_confs/nginx.conf:/etc/nginx/nginx.conf:ro

nginx.conf

user  nginx;
worker_processes  4;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;

events {
    worker_connections  1024;
}

stream {
    upstream dtr_8050 {
        least_conn;
        server localhost:8051  max_fails=1 fail_timeout=2s;
        server localhost:8052  max_fails=1 fail_timeout=2s;
        server localhost:8053  max_fails=1 fail_timeout=2s;
        server localhost:8054  max_fails=1 fail_timeout=2s;
        server localhost:8055  max_fails=1 fail_timeout=2s;
    }
    server {
        listen 8050;
        proxy_pass dtr_8050;
    }
}
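
To sanity-check the config before wiring it in (a quick sketch using the same image; adjust the host path/filename to wherever the file is actually saved), nginx -t just parses the file and exits:

docker run --rm \
  --volume /home/jalal/nginx_confs/nginx.conf:/etc/nginx/nginx.conf:ro \
  nginx:stable-alpine nginx -t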

To start a Splash instance

sudo docker run -d -p 8051:8050 --memory=1.5G --restart=unless-stopped \
  -v /home/jalal/splash_filters:/etc/splash/filters \
  scrapinghub/splash:2.3.3 --slots 8 --maxrss 800 --max-timeout 3600 -v1
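
The upstream above lists five backends (8051-8055), so the same command needs to run once per port. A sketch doing it in a loop (the --name flag is my addition so each container gets a distinct name):

for port in 8051 8052 8053 8054 8055; do
  sudo docker run -d -p ${port}:8050 --memory=1.5G --restart=unless-stopped \
    --name splash-${port} \
    -v /home/jalal/splash_filters:/etc/splash/filters \
    scrapinghub/splash:2.3.3 --slots 8 --maxrss 800 --max-timeout 3600 -v1
done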

To start nginx

sudo docker run --detach \
  --name dtr-lb \
  --restart=unless-stopped \
  --publish 8050:8050 \
  --volume /home/jalal/nginx_confs/splash.conf:/etc/nginx/nginx.conf:ro \
  --network host \
  nginx:stable-alpine
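
To check that traffic actually flows through the balancer to the pool, a quick test against Splash's render endpoint (example.com is just a placeholder URL):

curl 'http://localhost:8050/render.html?url=http://example.com&timeout=30&wait=0.5'

Splash also answers on /_ping, so curl http://localhost:8050/_ping works as a lighter health check.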

To start Squid (HTTPS cache)

docker run -d -h proxy.docker.dev --expose=3128 --name squid3 squid3:latest
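
Note that --expose only documents the port for container-to-container use; other containers reach Squid via its IP, which is what the ifconfig step below is for. If the cache should also be reachable from the host, a variant with a published port (a sketch, everything else unchanged):

docker run -d -h proxy.docker.dev -p 3128:3128 --name squid3 squid3:latest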

To get a bash shell inside the Squid container

docker exec -i -t squid3 /bin/bash   (then run ifconfig to get the container's IP)
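
The IP can also be pulled without exec'ing in, and the proxy smoke-tested with curl once you have it (the 172.17.0.x address below is just a hypothetical example):

docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' squid3
curl -x http://172.17.0.2:3128 -I http://example.com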