Building numpy and scipy with Intel MKL on a 64-bit machine
# This file provides configuration information about non-Python dependencies for
# numpy.distutils-using packages. Create a file like this called "site.cfg" next
# to your package's setup.py file and fill in the appropriate sections. Not all
# packages will use all sections so you should leave out sections that your
# package does not use.
# To assist automatic installation like easy_install, the user's home directory
# will also be checked for the file ~/.numpy-site.cfg .
# The format of the file is that of the standard library's ConfigParser module.
#
# http://www.python.org/doc/current/lib/module-ConfigParser.html
#
# Each section defines settings that apply to one particular dependency. Some of
# the settings are general and apply to nearly any section and are defined here.
# Settings specific to a particular section will be defined near their section.
#
# libraries
# Comma-separated list of library names to add to compile the extension
# with. Note that these should be just the names, not the filenames. For
# example, the file "libfoo.so" would become simply "foo".
# libraries = lapack,f77blas,cblas,atlas
#
# library_dirs
# List of directories to add to the library search path when compiling
# extensions with this dependency. Use the character given by os.pathsep
# to separate the items in the list. Note that this character is known to
# vary on some unix-like systems; if a colon does not work, try a comma.
# This also applies to include_dirs and src_dirs (see below).
# On UN*X-type systems (OS X, most BSD and Linux systems):
# library_dirs = /usr/lib:/usr/local/lib
# On Windows:
# library_dirs = c:\mingw\lib,c:\atlas\lib
# On some BSD and Linux systems:
# library_dirs = /usr/lib,/usr/local/lib
#
# include_dirs
# List of directories to add to the header file search path.
# include_dirs = /usr/include:/usr/local/include
#
# src_dirs
# List of directories that contain extracted source code for the
# dependency. For some dependencies, numpy.distutils will be able to build
# them from source if binaries cannot be found. The FORTRAN BLAS and
# LAPACK libraries are one example. However, most dependencies are more
# complicated and require actual installation that you need to do
# yourself.
# src_dirs = /home/rkern/src/BLAS_SRC:/home/rkern/src/LAPACK_SRC
#
# search_static_first
# Boolean (one of (0, false, no, off) for False or (1, true, yes, on) for
# True) to tell numpy.distutils to prefer static libraries (.a) over
# shared libraries (.so). It is turned off by default.
# search_static_first = false
# Defaults
# ========
# The settings given here will apply to all other sections if not overridden.
# This is a good place to add general library and include directories like
# /usr/local/{lib,include}
#
#[DEFAULT]
#library_dirs = /usr/local/lib
#include_dirs = /usr/local/include
# Atlas
# -----
# Atlas is an open source optimized implementation of the BLAS and Lapack
# routines. Numpy will try to build against Atlas by default when available in
# the system library dirs. To build numpy against a custom installation of
# Atlas you can add an explicit section such as the following. Here we assume
# that Atlas was configured with ``prefix=/opt/atlas``.
#
# [atlas]
# library_dirs = /opt/atlas/lib
# include_dirs = /opt/atlas/include
# OpenBLAS
# --------
# OpenBLAS is another open source optimized implementation of BLAS and Lapack
# and can be seen as an alternative to Atlas. To build numpy against OpenBLAS
# instead of Atlas, use this section instead of the above, adjusting as needed
# for your configuration (in the following example we installed OpenBLAS with
# ``make install PREFIX=/opt/OpenBLAS``).
#
# [openblas]
# libraries = openblas
# library_dirs = /opt/OpenBLAS/lib
# include_dirs = /opt/OpenBLAS/include
[mkl]
library_dirs = /opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64:/opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64
include_dirs = /opt/intel/composer_xe_2013_sp1.3.174/mkl/include
mkl_libs = mkl_rt
lapack_libs =
# MKL
#----
# MKL is Intel's very optimized yet proprietary implementation of BLAS and
# Lapack.
# For a recent MKL (9.0.21, for example), you need to change the name of the
# LAPACK library. Assuming you installed MKL in /opt, for a 32-bit CPU:
# [mkl]
# library_dirs = /opt/intel/mkl/9.1.023/lib/32/
# lapack_libs = mkl_lapack
#
# For 10.*, on 32-bit machines:
# [mkl]
# library_dirs = /opt/intel/mkl/10.0.1.014/lib/32/
# lapack_libs = mkl_lapack
# mkl_libs = mkl, guide
# UMFPACK
# -------
# The UMFPACK library is used in scikits.umfpack to factor large sparse matrices.
# It, in turn, depends on the AMD library for reordering the matrices for
# better performance. Note that the AMD library has nothing to do with AMD
# (Advanced Micro Devices), the CPU company.
#
# UMFPACK is not needed for numpy or scipy.
#
# http://www.cise.ufl.edu/research/sparse/umfpack/
# http://www.cise.ufl.edu/research/sparse/amd/
# http://scikits.appspot.com/umfpack
#
#[amd]
#amd_libs = amd
#
#[umfpack]
#umfpack_libs = umfpack
# FFT libraries
# -------------
# There are two FFT libraries that we can configure here: FFTW (2 and 3) and djbfft.
# Note that these libraries are not needed for numpy or scipy.
#
# http://fftw.org/
# http://cr.yp.to/djbfft.html
#
# Given only this section, numpy.distutils will try to figure out which version
# of FFTW you are using.
#[fftw]
#libraries = fftw3
#
# For djbfft, numpy.distutils will look for either djbfft.a or libdjbfft.a .
#[djbfft]
#include_dirs = /usr/local/djbfft/include
#library_dirs = /usr/local/djbfft/lib
from __future__ import division, absolute_import, print_function
from distutils.unixccompiler import UnixCCompiler
from numpy.distutils.exec_command import find_executable
class IntelEM64TCCompiler(UnixCCompiler):
    """ A modified Intel x86_64 compiler compatible with a 64bit gcc built Python.
    """
    compiler_type = 'intelem'
    cc_exe = 'icc -O3 -g -fPIC -fp-model strict -fomit-frame-pointer -openmp -xhost'
    #cc_args = "-fPIC"

    def __init__(self, verbose=0, dry_run=0, force=0):
        UnixCCompiler.__init__(self, verbose, dry_run, force)
        compiler = self.cc_exe
        self.set_executables(compiler=compiler,
                             compiler_so=compiler,
                             compiler_cxx=compiler,
                             linker_exe=compiler,
                             linker_so=compiler + ' -shared')
# http://developer.intel.com/software/products/compilers/flin/
from __future__ import division, absolute_import, print_function
import sys
from numpy.distutils.ccompiler import simple_version_match
from numpy.distutils.fcompiler import FCompiler, dummy_fortran_file
compilers = ['IntelFCompiler', 'IntelVisualFCompiler',
             'IntelItaniumFCompiler', 'IntelItaniumVisualFCompiler',
             'IntelEM64VisualFCompiler', 'IntelEM64TFCompiler']


def intel_version_match(type):
    # Match against the important stuff in the version string
    return simple_version_match(start=r'Intel.*?Fortran.*?(?:%s).*?Version' % (type,))


class BaseIntelFCompiler(FCompiler):
    def update_executables(self):
        f = dummy_fortran_file()
        self.executables['version_cmd'] = ['<F77>', '-FI', '-V', '-c',
                                           f + '.f', '-o', f + '.o']


class IntelFCompiler(BaseIntelFCompiler):
    compiler_type = 'intel'
    compiler_aliases = ('ifort',)
    description = 'Intel Fortran Compiler for 32-bit apps'
    version_match = intel_version_match('32-bit|IA-32')

    possible_executables = ['ifort', 'ifc']

    executables = {
        'version_cmd'  : None,          # set by update_executables
        'compiler_f77' : [None, "-72", "-w90", "-w95"],
        'compiler_f90' : [None],
        'compiler_fix' : [None, "-FI"],
        'linker_so'    : ["<F90>", "-shared"],
        'archiver'     : ["ar", "-cr"],
        'ranlib'       : ["ranlib"]
        }

    pic_flags = ['-fPIC']
    module_dir_switch = '-module '  # Don't remove ending space!
    module_include_switch = '-I'

    def get_flags_free(self):
        return ["-FR"]

    def get_flags(self):
        return ['-fPIC']

    def get_flags_opt(self):
        #return ['-i8 -xhost -openmp -fp-model strict']
        #return ['-xhost -openmp -fp-model strict']
        return ['-i8 -xhost -openmp -fp-model strict -fPIC']

    def get_flags_arch(self):
        return []

    def get_flags_linker_so(self):
        opt = FCompiler.get_flags_linker_so(self)
        v = self.get_version()
        if v and v >= '8.0':
            opt.append('-nofor_main')
        if sys.platform == 'darwin':
            # Here, it's -dynamiclib
            try:
                idx = opt.index('-shared')
                opt.remove('-shared')
            except ValueError:
                idx = 0
            opt[idx:idx] = ['-dynamiclib', '-Wl,-undefined,dynamic_lookup', '-Wl,-framework,Python']
        return opt


class IntelItaniumFCompiler(IntelFCompiler):
    compiler_type = 'intele'
    compiler_aliases = ()
    description = 'Intel Fortran Compiler for Itanium apps'

    version_match = intel_version_match('Itanium|IA-64')

    possible_executables = ['ifort', 'efort', 'efc']

    executables = {
        'version_cmd'  : None,
        'compiler_f77' : [None, "-FI", "-w90", "-w95"],
        'compiler_fix' : [None, "-FI"],
        'compiler_f90' : [None],
        'linker_so'    : ['<F90>', "-shared"],
        'archiver'     : ["ar", "-cr"],
        'ranlib'       : ["ranlib"]
        }


class IntelEM64TFCompiler(IntelFCompiler):
    compiler_type = 'intelem'
    compiler_aliases = ()
    description = 'Intel Fortran Compiler for 64-bit apps'

    version_match = intel_version_match('EM64T-based|Intel\\(R\\) 64|64|IA-64|64-bit')

    #possible_executables = ['ifort', 'efort', 'efc']
    possible_executables = ['ifort']

    executables = {
        'version_cmd'  : None,
        'compiler_f77' : [None, "-FI"],
        'compiler_fix' : [None, "-FI"],
        'compiler_f90' : [None],
        'linker_so'    : ['<F90>', "-shared"],
        'archiver'     : ["ar", "-cr"],
        'ranlib'       : ["ranlib"]
        }

    def get_flags(self):
        #return ['-fPIC']
        return ['-O3 -g -xhost -openmp -fp-model strict -fPIC']

    def get_flags_opt(self):
        #return ['-i8 -xhost -openmp -fp-model strict']
        #return ['-xhost -openmp -fp-model strict']
        #return ['-i8 -xhost -openmp -fp-model strict -fPIC']
        return []

    def get_flags_arch(self):
        return []


# Is there no difference in the version string between the above compilers
# and the Visual compilers?


class IntelVisualFCompiler(BaseIntelFCompiler):
    compiler_type = 'intelv'
    description = 'Intel Visual Fortran Compiler for 32-bit apps'
    version_match = intel_version_match('32-bit|IA-32')

    def update_executables(self):
        f = dummy_fortran_file()
        self.executables['version_cmd'] = ['<F77>', '/FI', '/c',
                                           f + '.f', '/o', f + '.o']

    ar_exe = 'lib.exe'
    possible_executables = ['ifort', 'ifl']

    executables = {
        'version_cmd'  : None,
        'compiler_f77' : [None, "-FI", "-w90", "-w95"],
        'compiler_fix' : [None, "-FI", "-4L72", "-w"],
        'compiler_f90' : [None],
        'linker_so'    : ['<F90>', "-shared"],
        'archiver'     : [ar_exe, "/verbose", "/OUT:"],
        'ranlib'       : None
        }

    compile_switch = '/c '
    object_switch = '/Fo'           # No space after /Fo!
    library_switch = '/OUT:'        # No space after /OUT:!
    module_dir_switch = '/module:'  # No space after /module:
    module_include_switch = '/I'

    def get_flags(self):
        opt = ['/nologo', '/MD', '/nbs', '/Qlowercase', '/us']
        return opt

    def get_flags_free(self):
        return ["-FR"]

    def get_flags_debug(self):
        return ['/4Yb', '/d2']

    def get_flags_opt(self):
        return ['/O2']

    def get_flags_arch(self):
        return ["/arch:IA-32", "/QaxSSE3"]


class IntelItaniumVisualFCompiler(IntelVisualFCompiler):
    compiler_type = 'intelev'
    description = 'Intel Visual Fortran Compiler for Itanium apps'

    version_match = intel_version_match('Itanium')

    possible_executables = ['efl']  # XXX this is a wild guess
    ar_exe = IntelVisualFCompiler.ar_exe

    executables = {
        'version_cmd'  : None,
        'compiler_f77' : [None, "-FI", "-w90", "-w95"],
        'compiler_fix' : [None, "-FI", "-4L72", "-w"],
        'compiler_f90' : [None],
        'linker_so'    : ['<F90>', "-shared"],
        'archiver'     : [ar_exe, "/verbose", "/OUT:"],
        'ranlib'       : None
        }


class IntelEM64VisualFCompiler(IntelVisualFCompiler):
    compiler_type = 'intelvem'
    description = 'Intel Visual Fortran Compiler for 64-bit apps'

    version_match = simple_version_match(start='Intel\(R\).*?64,')

    def get_flags_arch(self):
        return ["/arch:SSE2"]


if __name__ == '__main__':
    from distutils import log
    log.set_verbosity(2)
    from numpy.distutils.fcompiler import new_fcompiler
    compiler = new_fcompiler(compiler='intel')
    compiler.customize()
    print(compiler.get_version())
[root@composer_xe_2013_sp1.3.174]# tree -L 3
.
|-- Documentation
| |-- csupport
| |-- en_US
| | |-- Release_Notes_C_2013SP1_L_EN.pdf
| | |-- Release_Notes_F_2013SP1_L_EN.pdf
| | |-- clicense
| | |-- compiler_c
| | |-- compiler_f
| | |-- credist.txt
| | |-- debugger
| | |-- flicense
| | |-- fredist.txt
| | |-- get_started_lc.htm
| | |-- get_started_lf.htm
| | |-- gs_resources
| | |-- ipp
| | |-- mkl
| | |-- ssadiag_docs
| | |-- tbb
| | `-- tutorials
| |-- fsupport
| |-- ippsupport
| |-- ja_JP
| | |-- Release_Notes_C_2013SP1_L_JA.pdf
| | |-- Release_Notes_F_2013SP1_L_JA.pdf
| | |-- clicense
| | |-- compiler_c
| | |-- compiler_f
| | |-- credist.txt
| | |-- debugger
| | |-- flicense
| | |-- fredist.txt
| | |-- get_started_lc.htm
| | |-- get_started_lf.htm
| | |-- gs_resources
| | |-- ipp
| | |-- mkl
| | |-- ssadiag_docs
| | |-- tbb
| | `-- tutorials
| |-- mklsupport
| `-- tbbsupport
|-- Samples
| |-- en_US
| | |-- C++
| | |-- Fortran
| | `-- mkl
| `-- ja_JP
| |-- C++
| |-- Fortran
| `-- mkl
|-- bin
| |-- compilervars.csh
| |-- compilervars.sh
| |-- compilervars_arch.csh
| |-- compilervars_arch.sh
| |-- compilervars_global.csh
| |-- compilervars_global.sh
| |-- debuggervars.csh
| |-- debuggervars.sh
| |-- iccvars.csh -> ./compilervars.csh
| |-- iccvars.sh -> ./compilervars.sh
| |-- idbvars.csh
| |-- idbvars.sh
| |-- ifortvars.csh -> ./compilervars.csh
| |-- ifortvars.sh -> ./compilervars.sh
| |-- intel64
| | |-- codecov
| | |-- fortcom
| | |-- fpp
| | |-- gdb-ia -> /opt/intel/composer_xe_2013_sp1.3.174/bin/intel64/gdb_py24
| | |-- gdb_py24 -> /opt/intel/composer_xe_2013_sp1.3.174/debugger/gdb/intel64/py24/bin/gdb-ia
| | |-- ia32.xrd
| | |-- icc
| | |-- icc.cfg
| | |-- icpc
| | |-- icpc.cfg
| | |-- idb
| | |-- idb.el
| | |-- idbc
| | |-- idbserver
| | |-- idbvars.csh
| | |-- idbvars.sh
| | |-- ifort
| | |-- ifort.cfg
| | |-- iidb
| | |-- inspxe-inject
| | |-- inspxe-runsc
| | |-- inspxe-wrap
| | |-- libiml_attr.so
| | |-- libintelremotemon.so
| | |-- loopprofileviewer.csh
| | |-- loopprofileviewer.sh
| | |-- map_opts
| | |-- mcpcom
| | |-- mic_extract
| | |-- prelink
| | |-- profdcg
| | |-- profmerge
| | |-- proforder
| | |-- tselect
| | |-- xiar
| | `-- xild
| |-- intel64_mic
| | |-- codecov
| | |-- fortcom
| | |-- fpp
| | |-- gdb-mic -> /opt/intel/composer_xe_2013_sp1.3.174/bin/intel64_mic/gdb_py24
| | |-- gdb_py24 -> /opt/intel/composer_xe_2013_sp1.3.174/debugger/gdb/intel64_mic/py24/bin/gdb-mic
| | |-- ia32.xrd
| | |-- icc
| | |-- icc.cfg
| | |-- icpc
| | |-- icpc.cfg
| | |-- idb.el
| | |-- idb_mpm
| | |-- idbc_mic
| | |-- idbvars.csh
| | |-- idbvars.sh
| | |-- ifort
| | |-- ifort.cfg
| | |-- iidb_mic
| | |-- libiml_attr.so
| | |-- libintelremotemon.so
| | |-- map_opts
| | |-- mcpcom
| | |-- mpm
| | |-- prelink
| | |-- profdcg
| | |-- profmerge
| | |-- proforder
| | |-- tselect
| | |-- x86_64-linux.env
| | |-- xiar
| | |-- xiar.cfg
| | |-- xild
| | `-- xild.cfg
| |-- loopprofileviewer.jar
| `-- sourcechecker
| |-- bin
| |-- config
| |-- lib
| `-- message
|-- compiler
| |-- include
| | |-- _intel_mf_runtime.h
| | |-- atomic
| | |-- atomicint.h
| | |-- bfp754.h
| | |-- bfp754_conf.h
| | |-- bfp754_functionnames.h
| | |-- bfp754_macros.h
| | |-- bfp754_types.h
| | |-- chkp.h
| | |-- cilk
| | |-- complex
| | |-- complex.h
| | |-- dfp754.h
| | |-- dvec.h
| | |-- emm_func.h
| | |-- emmintrin.h
| | |-- fenv.h
| | |-- float.h
| | |-- for_fpclass.for
| | |-- for_fpclass.h
| | |-- for_fpeflags.for
| | |-- for_fpeflags.h
| | |-- for_iosdef.for
| | |-- for_iosdef.h
| | |-- fordef.for
| | |-- fordef.h
| | |-- foriosdef.f90
| | |-- forreent.for
| | |-- forreent.h
| | |-- fvec.h
| | |-- ia32intrin.h
| | |-- ieee_arithmetic.f90
| | |-- ieee_exceptions.f90
| | |-- ieee_features.f90
| | |-- ifcore.f90
| | |-- ifestablish.f90
| | |-- iflport.f90
| | |-- iflposix.f90
| | |-- ifport.f90
| | |-- ifposix.f90
| | |-- immintrin.h
| | |-- intel64
| | |-- iso_c_binding.f90
| | |-- iso_fortran_env.f90
| | |-- istrconv.h
| | |-- ivec.h
| | |-- math.h
| | |-- mathimf.h
| | |-- mic
| | |-- mic_lib.f90
| | |-- mmintrin.h
| | |-- nmmintrin.h
| | |-- offload.h
| | |-- omp.h
| | |-- omp_lib.f90
| | |-- omp_lib.h
| | |-- pgouser.h
| | |-- pmmintrin.h
| | |-- smmintrin.h
| | |-- sse2mmx.h
| | |-- stdatomic.h
| | |-- tbk_traceback.h
| | |-- tgmath.h
| | |-- tmmintrin.h
| | |-- wmmintrin.h
| | |-- xmm_func.h
| | |-- xmm_utils.h
| | |-- xmmintrin.h
| | `-- zmmintrin.h
| |-- lib
| | |-- intel64
| | `-- mic
| `-- perf_headers
| `-- c++
|-- debugger
| |-- cdt
| | |-- features
| | `-- plugins
| |-- eclipse_src
| | `-- cdt.tar.gz
| |-- gdb
| | |-- LICENSES
| | |-- intel64
| | |-- intel64_mic
| | |-- src
| | `-- target
| |-- gui
| | |-- common
| | |-- ia32
| | `-- intel64
| |-- intel64
| | |-- locale
| | `-- printrules
| |-- lib
| | `-- intel64
| |-- mic
| | |-- lib
| | |-- locale
| | |-- printrules
| | `-- third_party
| |-- mpm
| | `-- bin
| `-- third_party
| |-- Apache-License
| `-- epl-v10.txt
|-- eclipse_support
| `-- cdt8.0
| `-- eclipse
|-- foldermap.sc.xml
|-- ipp
| |-- bin
| | |-- intel64
| | |-- ippvars.csh
| | `-- ippvars.sh
| |-- examples
| | `-- ipp-examples.tgz
| |-- include
| | |-- ipp.h
| | |-- ippac.h
| | |-- ippcc.h
| | |-- ippch.h
| | |-- ippcore.h
| | |-- ippcv.h
| | |-- ippdc.h
| | |-- ippdefs.h
| | |-- ippdi.h
| | |-- ippgen.h
| | |-- ippi.h
| | |-- ippj.h
| | |-- ippm.h
| | |-- ippr.h
| | |-- ipps.h
| | |-- ippsc.h
| | |-- ippvc.h
| | |-- ippversion.h
| | `-- ippvm.h
| |-- lib
| | |-- intel64
| | `-- mic
| `-- tools
| `-- intel64
|-- man
| |-- en_US
| | `-- man1
| `-- ja_JP
| `-- man1
|-- mkl
| |-- benchmarks
| | |-- linpack
| | `-- mp_linpack
| |-- bin
| | |-- ia32
| | |-- intel64
| | |-- mklvars.csh
| | `-- mklvars.sh
| |-- examples
| | |-- examples_cluster.tgz
| | |-- examples_core.tgz
| | |-- examples_f95.tgz
| | `-- examples_mic.tgz
| |-- include
| | |-- blas.f90
| | |-- fftw
| | |-- i_malloc.h
| | |-- intel64
| | |-- lapack.f90
| | |-- mic
| | |-- mkl.fi
| | |-- mkl.h
| | |-- mkl_blacs.h
| | |-- mkl_blas.fi
| | |-- mkl_blas.h
| | |-- mkl_cblas.h
| | |-- mkl_cdft.f90
| | |-- mkl_cdft.h
| | |-- mkl_cdft_types.h
| | |-- mkl_df.f90
| | |-- mkl_df.h
| | |-- mkl_df_defines.h
| | |-- mkl_df_functions.h
| | |-- mkl_df_types.h
| | |-- mkl_dfti.f90
| | |-- mkl_dfti.h
| | |-- mkl_dss.f77
| | |-- mkl_dss.f90
| | |-- mkl_dss.fi
| | |-- mkl_dss.h
| | |-- mkl_dss_pardiso.h
| | |-- mkl_lapack.fi
| | |-- mkl_lapack.h
| | |-- mkl_lapacke.h
| | |-- mkl_pardiso.f77
| | |-- mkl_pardiso.f90
| | |-- mkl_pardiso.fi
| | |-- mkl_pardiso.h
| | |-- mkl_pblas.h
| | |-- mkl_poisson.f90
| | |-- mkl_poisson.h
| | |-- mkl_rci.f90
| | |-- mkl_rci.fi
| | |-- mkl_rci.h
| | |-- mkl_scalapack.h
| | |-- mkl_service.f90
| | |-- mkl_service.fi
| | |-- mkl_service.h
| | |-- mkl_solver.f77
| | |-- mkl_solver.f90
| | |-- mkl_solver.fi
| | |-- mkl_solver.h
| | |-- mkl_solvers_ee.fi
| | |-- mkl_solvers_ee.h
| | |-- mkl_spblas.fi
| | |-- mkl_spblas.h
| | |-- mkl_trans.fi
| | |-- mkl_trans.h
| | |-- mkl_trig_transforms.f90
| | |-- mkl_trig_transforms.h
| | |-- mkl_types.h
| | |-- mkl_vml.f77
| | |-- mkl_vml.f90
| | |-- mkl_vml.fi
| | |-- mkl_vml.h
| | |-- mkl_vml_defines.h
| | |-- mkl_vml_functions.h
| | |-- mkl_vml_types.h
| | |-- mkl_vsl.f77
| | |-- mkl_vsl.f90
| | |-- mkl_vsl.fi
| | |-- mkl_vsl.h
| | |-- mkl_vsl_defines.h
| | |-- mkl_vsl_functions.h
| | |-- mkl_vsl_subroutine.fi
| | `-- mkl_vsl_types.h
| |-- interfaces
| | |-- blas95
| | |-- fftw2x_cdft
| | |-- fftw2xc
| | |-- fftw2xf
| | |-- fftw3x_cdft
| | |-- fftw3xc
| | |-- fftw3xf
| | `-- lapack95
| |-- lib
| | |-- intel64
| | `-- mic
| |-- tests
| | |-- tests_cluster.tgz
| | `-- tests_core.tgz
| `-- tools
| |-- builder
| `-- mkl_link_tool
|-- mpirt
| |-- bin
| | |-- intel64
| | `-- mic
| `-- lib
| |-- intel64
| `-- mic
|-- pkg_bin -> ./bin
|-- tbb
| |-- bin
| | |-- tbbvars.csh
| | `-- tbbvars.sh
| |-- examples
| | |-- GettingStarted
| | |-- common
| | |-- concurrent_hash_map
| | |-- concurrent_priority_queue
| | |-- graph
| | |-- index.html
| | |-- parallel_do
| | |-- parallel_for
| | |-- parallel_reduce
| | |-- pipeline
| | |-- task
| | |-- task_group
| | |-- task_priority
| | `-- test_all
| |-- include
| | |-- index.html
| | |-- serial
| | `-- tbb
| |-- index.html
| `-- lib
| |-- android
| |-- ia32
| |-- intel64
| `-- mic
|-- uninstall
| |-- 32
| | |-- gcc-3.2
| | |-- install_cli.32
| | |-- install_gui.32
| | |-- libxml2.so.2
| | |-- libz
| | `-- qt
| |-- 32e
| | |-- gcc-3.2
| | |-- install_cli.32e
| | |-- install_gui.32e
| | |-- libxml2.so.2
| | |-- libz
| | `-- qt
| |-- cluster_install.cab
| |-- compiler_symlinks_layer.cab
| |-- images
| | `-- finish_jp.png
| |-- licenses
| | |-- libstdc++
| | `-- libxml
| |-- markers
| |-- mediaconfig.xml
| |-- phonehome.cab
| `-- uninstall.ini
|-- uninstall.sh
`-- uninstall_GUI.sh
168 directories, 293 files
[root@composer_xe_2013_sp1.3.174]#
cd /opt
[root@intel]# tree -L 2
.
|-- composer_xe_2013_sp1.3.174
| |-- Documentation
| |-- Samples
| |-- bin
| |-- compiler
| |-- debugger
| |-- eclipse_support
| |-- foldermap.sc.xml
| |-- ipp
| |-- man
| |-- mkl
| |-- mpirt
| |-- pkg_bin -> ./bin
| |-- tbb
| |-- uninstall
| |-- uninstall.sh
| `-- uninstall_GUI.sh
|-- ism
| `-- rm
`-- licenses
|-- intel_c_compiler.lic
`-- intel_fortran.lic
At work we make heavy use of the Python/numpy/scipy stack.
Currently I am using:
NumPy version 1.8.0
SciPy version 0.14.0
We have built numpy and scipy against Intel's MKL, with Intel's C/C++ compiler icc and Intel's Fortran compiler ifort. As far as I know, and also from my own experience, this combination of tools gets you close to optimum performance for numerical simulations on conventional hardware, especially for linear algebra calculations. If you get the build right and write proper code, i.e. use numpy's data types and numpy's/scipy's functions the right way, then, generally speaking, no commercial or other open source software package is able to squeeze more performance out of your hardware. I just updated the installation to Python 2.7.3 and the numpy/scipy versions listed above. I built on a 64-bit machine with the Intel suite version 14.0.3.174; newer versions of the Intel suite have trouble resolving library dependencies in this kind of build (which is the topic of this blog post). The documentation on building numpy and scipy with the Intel suite is sparse, so I describe the procedure I took in this article.
A few notes on performance and version decisions: compiling numpy and scipy against MKL provides a significant single-thread performance boost over classical builds. Furthermore, building against Intel's OpenMP library can provide extremely efficient multicore performance through automatic threading of ordinary math operations. You want this boost, believe me, so it makes a lot of sense to build numpy/scipy with the Intel suite. However, it does not really make sense to build (C)Python itself with the Intel suite. It is not worth the effort; somewhere I have read that the result might even be slower than an optimized GCC build. In any case, Python is not so much about math, it is about reliability and control. We should do the math in numpy/scipy and use optimized builds for those; then there is little to gain from optimizing the Python build itself. Regarding the Intel suite, I am pretty sure that newer versions do not provide large performance improvements over 14.0.3.174. These are the reasons why I built things the following way:
CPython 2.7.3 in classical configure/make/make install fashion, with the system’s GCC.
numpy and scipy with Intel suite (MKL, icc, ifort) version 14.0.3.174.
Prerequisites
[root@]# ifort -V
Intel(R) Fortran Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 14.0.3.174 Build 20140422
Copyright (C) 1985-2014 Intel Corporation. All rights reserved.
FOR NON-COMMERCIAL USE ONLY
[root@]# icc -V
Intel(R) C Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 14.0.3.174 Build 20140422
Copyright (C) 1985-2014 Intel Corporation. All rights reserved.
FOR NON-COMMERCIAL USE ONLY
I assume that you have built Python as a non-privileged user and installed it to some location in your file system (by the way: never just overwrite your system's Python with a custom build!). Set up the environment for this build, so that
$ python
invokes it. You can validate this via $ python --version and $ command -v python. In the following steps, we will work as the same user that built and installed Python: the numpy and scipy installation files will go right into the directory tree of the custom Python build.
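The output of those two commands should point into your custom Python tree, roughly like this (the prefix below is just an illustration, borrowed from the test output further down; yours will differ):
$ command -v python
/projects/bioinfp_apps/Python-2.7.3/bin/python
$ python --version
Python 2.7.3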
I also assume that you have a working installation of MKL, icc, and ifort, and that you have set up their environment like this:
$ source /path_to_intel_compilers/bin/compilervars.sh intel64
After that, your PATH and LD_LIBRARY_PATH should have been adjusted to contain the directories with all the Intel stuff inside. You should validate this. This is the icc version I am running:
$ icc --version
icc (ICC) 14.0.3.174 20120212
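A quick way to eyeball whether the Intel directories really ended up in PATH and LD_LIBRARY_PATH (just one possible check, any equivalent inspection will do):
$ echo $PATH | tr ':' '\n' | grep -i intel
$ echo $LD_LIBRARY_PATH | tr ':' '\n' | grep -i intel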
Prepare, build, install, and validate numpy
In the numpy source directory create the file site.cfg with the following content:
[mkl]
library_dirs = /path_to_intel_compilers/mkl/lib/intel64/
include_dirs = /path_to_intel_compilers/mkl/include/
mkl_libs = mkl_rt
lapack_libs =
That is right, nothing more is needed here. This is also what is recommended in http://software.intel.com/en-us/articles/numpy-scipy-with-mkl.
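The mkl_rt entry refers to MKL's single dynamic runtime library, i.e. libmkl_rt.so, following the naming rule explained in the site.cfg comments above (the file libfoo.so becomes just foo). Before building, it cannot hurt to check that this file actually exists under the configured library_dirs (a sketch; substitute your real prefix for /path_to_intel_compilers):
$ ls /path_to_intel_compilers/mkl/lib/intel64/libmkl_rt.so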
What is left to be done is setting compiler flags. The build process uses compiler abstractions stored in the two files
numpy/distutils/fcompiler/intel.py
and
numpy/distutils/intelccompiler.py
You might be frightened of these files because they appear so important. I am frightened of them because they are partly inconsistent with themselves and far more bloated than their very simple task justifies. Do not be afraid to trim them down in order to be sure about what happens during the actual build.
In numpy/distutils/fcompiler/intel.py you can safely edit the get_flags* methods of the IntelEM64TFCompiler class, so that they look like this:
class IntelEM64TFCompiler(IntelFCompiler):
    compiler_type = 'intelem'

    [...]

    def get_flags(self):
        return ['-O3 -g -xhost -openmp -fp-model strict -fPIC']

    def get_flags_opt(self):
        return []

    def get_flags_arch(self):
        return []
See, this compiler class describes itself as intelem, and that is the compiler type we are going to pass on the build command line. The class name IntelEM64TFCompiler itself has no further meaning. We are just editing one of the classes, making sure that it carries the compiler flags we want, and eventually using this compiler abstraction during the build. I made sure that the compiler flags '-O3 -g -xhost -openmp -fp-model strict -fPIC' are used by having the method get_flags return them and by making all other related methods return nothing. Additionally (not shown above), I also edited the possible_executables attribute of the class:
possible_executables = ['ifort']
just to make sure that ifort will be used. I also removed all compiler classes that are not needed from that file, just to get a better overview, but that is not required.
Next, edit numpy/distutils/intelccompiler.py. I deleted tons of stuff; this is *all* the content that remains, and you can use it like this:
from __future__ import division, absolute_import, print_function
from distutils.unixccompiler import UnixCCompiler
from numpy.distutils.exec_command import find_executable
class IntelEM64TCCompiler(UnixCCompiler):
    """ A modified Intel x86_64 compiler compatible with a 64bit gcc built Python.
    """
    compiler_type = 'intelem'
    cc_exe = 'icc -O3 -g -fPIC -fp-model strict -fomit-frame-pointer -openmp -xhost'
    #cc_args = "-fPIC"

    def __init__(self, verbose=0, dry_run=0, force=0):
        UnixCCompiler.__init__(self, verbose, dry_run, force)
        compiler = self.cc_exe
        self.set_executables(compiler=compiler,
                             compiler_so=compiler,
                             compiler_cxx=compiler,
                             linker_exe=compiler,
                             linker_so=compiler + ' -shared')
Again, regarding the compiler flags, I trusted the Intel people somewhat and took most of them from http://software.intel.com/en-us/articles/numpy-scipy-with-mkl. I also read the documentation for all of them, and they seem to make sense. But you should think them through for your environment; it is good to know what they mean.
Then build numpy:
python setup.py build --compiler=intelem | tee build.log
Copy the build to the Python tree (install):
python setup.py install | tee install.log
Validate that the numpy build works:
20:20:57 $ python
Python 2.7.3 (default, Feb 18 2014, 15:09:15)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.__version__
'1.8.0'
>>> numpy.test()
Running unit tests for numpy
NumPy version 1.8.0
NumPy is installed in /projects/bioinfp_apps/Python-2.7.3/lib/python2.7/site-packages/numpy
Python version 2.7.3 (default, Feb 18 2014, 15:09:15) [GCC 4.1.2 20080704 (Red Hat 4.1.2-52)]
nose version 1.3.0
[... snip ...]
----------------------------------------------------------------------
Ran 4969 tests in 63.975s
OK (KNOWNFAIL=5, SKIP=3)
<nose.result.TextTestResult run=4969 errors=0 failures=0>
That looks great. Proceed.
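If you want to be extra sure that it is really the MKL runtime that got linked in (and not some system BLAS), you can inspect one of the compiled extension modules with ldd. This is only a sketch, assuming a Linux system; I am simply using the lapack_lite module that the import resolves to:
$ ldd $(python -c "import numpy; print(numpy.linalg.lapack_lite.__file__)") | grep -i mkl
You should see libmkl_rt.so resolving to the MKL directory configured in site.cfg.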
Build, install, and validate scipy
Now that numpy is installed(!), we can go ahead with building scipy. Extract the scipy source, enter the source directory and invoke
python setup.py config --compiler=intelem --fcompiler=intelem build_clib --compiler=intelem --fcompiler=intelem build_ext --compiler=intelem --fcompiler=intelem install | tee build_install.log
The scipy build process uses the same build settings as numpy, especially the distutils compiler abstraction stuff (which is why numpy needs to be installed first; this is a simple fact that the official docs do not explain well). I built and installed in one go, on purpose. When doing build and install as separate steps, the install step involves some minor compilation tasks which are then performed using gfortran instead of ifort. At first I thought this was not an issue, but when executing scipy.test() I soon got a segmentation fault due to the mixing of compilers. When using the command as above (basically taken from http://software.intel.com/en-us/articles/numpy-scipy-with-mkl), the test result is positive:
[~]# python
Python 2.7.3 (default, Jul 16 2014, 13:10:16)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-54)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import scipy
>>> scipy.test()
Running unit tests for scipy
NumPy version 1.8.0
NumPy is installed in /root/app_packages/python2.7/lib/python2.7/site-packages/numpy
SciPy version 0.14.0
SciPy is installed in /root/app_packages/python2.7/lib/python2.7/site-packages/scipy
Python version 2.7.3 (default, Jul 16 2014, 13:10:16) [GCC 4.1.2 20080704 (Red Hat 4.1.2-54)]
nose version 1.3.3
Congratulations, done. Enjoy the performance.
Everything should be running well. You can now test numpy by using
>>> import numpy
>>> numpy.show_config()
>>> numpy.test()
Everything should be running well. You can now test scipy by using
>>> import scipy
>>> scipy.show_config()
>>> scipy.test()
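As a rough, hands-on check of the multicore claims made at the beginning of this post, you can time a large matrix product with and without MKL threading. This is only a sketch (the matrix size is arbitrary, and limiting threads via the MKL_NUM_THREADS environment variable is my assumption about how you would pin MKL to one core; adjust to your machine):
$ MKL_NUM_THREADS=1 python -m timeit -s "import numpy; a = numpy.random.rand(2000, 2000)" "numpy.dot(a, a)"
$ python -m timeit -s "import numpy; a = numpy.random.rand(2000, 2000)" "numpy.dot(a, a)"
If the build picked up MKL and its threading correctly, the second invocation should be noticeably faster on a multicore machine.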
My .bashrc file
~]# cat .bashrc
export PATH=/opt/intel/composer_xe_2013_sp1.3.174/bin/intel64:$PATH
export PYTHONPATH=$HOME/app_packages/python2.7
export PATH=$PYTHONPATH/bin:$PATH
export LD_LIBRARY_PATH=$PYTHONPATH/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64:/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/:$LD_LIBRARY_PATH
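Instead of exporting the Intel directories by hand as in the .bashrc above, you could also simply source the compilervars script there; this is the same mechanism used in the Prerequisites section and should yield an equivalent environment:
source /opt/intel/composer_xe_2013_sp1.3.174/bin/compilervars.sh intel64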
Build command for numpy
python setup.py config --compiler=intelem build_clib --compiler=intelem build_ext --compiler=intelem install
Build command for scipy
python setup.py config --compiler=intelem --fcompiler=intelem build_clib --compiler=intelem --fcompiler=intelem build_ext --compiler=intelem --fcompiler=intelem install
Note: if you are building on a 32-bit machine, you need to use the intel keyword instead of intelem.
You'll need to specify that you want to use the Intel compiler for both numpy and scipy. (Numpy doesn't require a Fortran compiler, but it will use one if present.)
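To make that concrete: on a 32-bit machine the scipy invocation would presumably become the following (an untested sketch that simply swaps intelem for intel in the command shown earlier):
python setup.py config --compiler=intel --fcompiler=intel build_clib --compiler=intel --fcompiler=intel build_ext --compiler=intel --fcompiler=intel install | tee build_install.log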