mbohun
8/6/2015 - 7:45 AM

convert SMILES to IUPAC

How to convert SMILES to IUPAC name:

bash-3.2$ cat my_SMILES.smi | xargs ./convert_SMILES_to_IUPAC.sh
CC(=O)C propan-2-one
CCO ethanol
NCCCCCCN HEXANE-1,6-DIAMINE
NCCc1ccccc1 2-phenylethanamine
NCCc1cc(OC)c(OC)c(OC)c1 2-(3,4,5-trimethoxyphenyl)ethanamine
OC(=O)c1c(OC(=O)C)cccc1 2-acetyloxybenzoic acid
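
One caveat with the `xargs` pipeline above: by default `xargs` treats quotes and backslashes in its input specially, and SMILES strings can contain `\` (used for bond stereochemistry). A minimal sketch of a line-by-line alternative follows; the function name `for_each_smiles` is hypothetical and not part of the original snippet.

```bash
# Hypothetical helper: run a command once per non-empty line of a .smi file,
# passing the line through unmodified (no quote or backslash processing).
for_each_smiles() {
    local file="$1"; shift
    local smiles
    # Read from file descriptor 3 so the invoked command keeps its own stdin.
    while IFS= read -r smiles <&3 || [ -n "$smiles" ]; do
        [ -z "$smiles" ] && continue   # skip blank lines
        "$@" "$smiles"
    done 3< "$file"
}
```

It would be called as `for_each_smiles my_SMILES.smi ./convert_SMILES_to_IUPAC.sh`, which invokes the script once per SMILES string rather than once for the whole list.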

#### Update/warning

The CACTUS API appears to perform only a simple lookup that works for very simple molecules. It fails to identify many rather basic compounds (the service answers with a 404 page), so it is quite limited:

bash-3.2$ cat my_SMILES.smi | xargs ./convert_SMILES_to_IUPAC.sh
CC(=O)C propan-2-one
CCO ethanol
NCCCCCCN HEXANE-1,6-DIAMINE
NCCc1ccccc1 2-phenylethanamine
NCCc1cc(OC)c(OC)c(OC)c1 2-(3,4,5-trimethoxyphenyl)ethanamine
OC(=O)c1c(OC(=O)C)cccc1 2-acetyloxybenzoic acid
NC(=O)N UREA
NCCc1c(OC)cc(Br)c(OC)c1 <h1>Page not found (404)</h1>
Cc1cc(c2c3c1c4c(cc(c5c4c6c3c7c(c(cc(c7c2=O)O)O)c8c6c(c5=O)c(cc8O)O)O)C)O <h1>Page not found (404)</h1>
O=C1c3c(O/C(=C1/O)c2ccc(O)c(O)c2)cc(O)cc3O 2-(3,4-dihydroxyphenyl)-3,5,7-trihydroxychromen-4-one
Oc2ccc(C=Cc1cc(O)cc(O)c1)cc2 <h1>Page not found (404)</h1>
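
When the lookup fails, the raw HTML error page ends up in the name column. A sketch of one way to print a placeholder instead, using curl's `-f` option so that HTTP errors become a non-zero exit status. The function name `lookup_iupac` and the `NOT_FOUND` marker are assumptions made for illustration, not part of the original script.

```bash
# Hypothetical variant of the lookup: print "<SMILES> NOT_FOUND" instead of
# the HTML error page when the service cannot resolve the structure.
lookup_iupac() {
    local smiles="$1" name
    # -s: no progress output
    # -f: exit non-zero on HTTP errors such as 404 and suppress the error body
    if name=$(curl -sf "http://cactus.nci.nih.gov/chemical/structure/${smiles}/iupac_name"); then
        echo "${smiles} ${name}"
    else
        echo "${smiles} NOT_FOUND"
    fi
}
```

Unresolved structures can then be filtered with something like `grep NOT_FOUND` rather than by matching HTML.
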
CC(=O)C
CCO
NCCCCCCN
NCCc1ccccc1
NCCc1cc(OC)c(OC)c(OC)c1
OC(=O)c1c(OC(=O)C)cccc1
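#!/bin/bash
# Print each SMILES string passed as an argument, followed by the IUPAC name
# returned by the NCI CACTUS Chemical Identifier Resolver.

# Iterate over "$@" directly so arguments are not re-split or glob-expanded.
for SMILES in "$@"
do
    iupac_name=$(curl -s "http://cactus.nci.nih.gov/chemical/structure/${SMILES}/iupac_name")
    echo "${SMILES} ${iupac_name}"
done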
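
SMILES strings can contain characters that have a special meaning in a URL or to curl itself: `#` (triple bond) starts a URL fragment, and `[` `]` trigger curl's URL globbing. A minimal sketch of percent-encoding such characters before building the request URL is shown here. The function name `urlencode_smiles` is hypothetical, the set of characters passed through unencoded is an assumption based on the unencoded strings that resolve in the output above, and the input is assumed to be plain ASCII.

```bash
# Hypothetical helper: percent-encode characters in a SMILES string that are
# not safe to place in the URL path as-is. Assumes ASCII input.
urlencode_smiles() {
    local input="$1" output="" char i
    local LC_ALL=C
    for (( i = 0; i < ${#input}; i++ )); do
        char="${input:i:1}"
        case "$char" in
            # Characters left as-is (letters, digits, and punctuation that
            # already works unencoded in the sample output).
            [a-zA-Z0-9.~_=@+/:-] | '(' | ')') output+="$char" ;;
            # Everything else becomes %XX (uppercase hex of the character code).
            *) printf -v char '%%%02X' "'$char"; output+="$char" ;;
        esac
    done
    printf '%s\n' "$output"
}
```

For example, `urlencode_smiles 'C#N'` yields `C%23N`, which could be substituted for `${SMILES}` in the URL while the original string is still echoed in the output.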