beatrice-m
12/4/2017 - 3:36 AM

IS Pipeline

What to do to install and maintain the IS pipeline:

# Install Homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install Java
brew install java

# Install Scala
brew install scala

# Install Spark
# Check that the installed version is the same as the one on AWS EMR
# https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-spark.html
brew install apache-spark
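
One way to run the version check above is sketched below. It is only a sketch: the target version `2.2.0` is a placeholder (look up the real value for your EMR release on the page linked above), the helper names are made up for this note, and comparing only the major.minor part is an assumption about how strict the match needs to be.

```shell
#!/bin/sh
# Placeholder target: replace with the Spark version listed for your EMR release.
EMR_SPARK_VERSION="${EMR_SPARK_VERSION:-2.2.0}"

# Print the major.minor part of a version string, e.g. 2.2.1 -> 2.2
major_minor() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Succeed when both versions share the same major.minor (hypothetical helper).
same_release_line() {
    [ "$(major_minor "$1")" = "$(major_minor "$2")" ]
}

# Extract the Spark version from the spark-submit banner.
# The banner may be written to stderr, so stderr is merged into stdout.
local_spark_version() {
    spark-submit --version 2>&1 \
        | sed -n 's/.*version \([0-9][0-9.]*\).*/\1/p' \
        | head -n 1
}

if command -v spark-submit >/dev/null 2>&1; then
    local_version="$(local_spark_version)"
    if same_release_line "$local_version" "$EMR_SPARK_VERSION"; then
        echo "OK: local Spark $local_version matches EMR $EMR_SPARK_VERSION"
    else
        echo "Mismatch: local Spark $local_version, EMR $EMR_SPARK_VERSION"
    fi
else
    echo "spark-submit not found on PATH"
fi
```

To check against a different release, set the variable when running the script, for example `EMR_SPARK_VERSION=2.3.0 sh check_spark.sh` (the file name is arbitrary).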


# Install Python 3
# Yes, Python 3, and not 2.7 - Live with the times!
brew install python3


# Install Snappy
brew install snappy

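# Install Pip
# pip3 comes with Homebrew's Python 3 (there is no separate pip3 formula), so just upgrade it
python3 -m pip install --upgrade pip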

# Install AWS Command Line Interface
pip3 install awscli

# Install python packages
pip3 install beautifulsoup4
pip3 install numpy
pip3 install pandas
pip3 install matplotlib
pip3 install boto3
pip3 install pyspark
pip3 install parquet
pip3 install pyarrow
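
After the installs, it can help to confirm that every package actually imports. The sketch below assumes `python3` is the interpreter the packages were installed for. The module list uses Python import names (Beautiful Soup imports as `bs4`), and `missing_modules` is a helper name made up for this note.

```shell
#!/bin/sh
# Import names of the packages installed above.
MODULES="bs4 numpy pandas matplotlib boto3 pyspark parquet pyarrow"

# Print each module from the arguments that python3 fails to import.
missing_modules() {
    for module in "$@"; do
        python3 -c "import $module" >/dev/null 2>&1 || echo "$module"
    done
}

if command -v python3 >/dev/null 2>&1; then
    # $MODULES is left unquoted on purpose so the shell splits it into words.
    missing="$(missing_modules $MODULES)"
    if [ -z "$missing" ]; then
        echo "All Python packages import correctly"
    else
        echo "Missing or broken packages:"
        echo "$missing"
    fi
else
    echo "python3 not found on PATH"
fi
```

Anything listed as missing can be reinstalled with `pip3 install` using the package names from the list above.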

# Other things to install

- Eclipse
- Scala for Eclipse
- PyDev for Eclipse
- Tableau