This script translates from the source language into the languages that are missing. You can choose between googletrans and home-made functions for the Google API. For later bulk translation of title elements from Inputmaster, the file downloaded from Google must be used. This requires an advanced setup on Google Cloud, which is left to Jesper. When this file is ready, the translations can be verified manually. The result ends up in the verified title table. The classified elements should also be moved into ElementMaster with language as an additional field. Titles we entered ourselves in the title administration in Vicky are no longer included here as of 250220, only the official tables from SSB, SCB and ILO.
Codes in the source column: S = Same, O = Original, G = Google.
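The home-made Google API functions are not shown in these notes. As a rough sketch only, they might look something like the following, assuming the google-cloud-translate package (v2 client) and assuming that credentials are supplied through the GOOGLE_APPLICATION_CREDENTIALS environment variable. The function names match the calls in the script below, but the bodies are guesses, not the actual helper module.

```python
import os

CREDENTIALS_ENV = "GOOGLE_APPLICATION_CREDENTIALS"  # variable read by Google client libraries


def IsGoogleAPICredentials():
    """Return True if the credentials variable points to an existing file (assumed check)."""
    path = os.environ.get(CREDENTIALS_ENV, "")
    return bool(path) and os.path.isfile(path)


def SetGoogleAPIcredentials(path):
    """Point the Google client libraries at a service-account JSON file."""
    os.environ[CREDENTIALS_ENV] = path


def gTransWithSource(text, source, target):
    """Translate text from source to target language with the v2 client."""
    # imported here so the credential helpers work even without the package installed
    from google.cloud import translate_v2 as translate
    client = translate.Client()
    result = client.translate(text, source_language=source,
                              target_language=target, format_="text")
    return result["translatedText"]
```

Passing format_="text" asks the API for plain text rather than HTML-escaped output, which seems the safer choice for single words and short titles.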
Error message from Google when the request limit is reached:
  File "C:\Users\mw10\AppData\Local\Continuum\anaconda3\lib\site-packages\google\cloud\_http.py", line 423, in api_request
    raise exceptions.from_http_response(response)
ServiceUnavailable: 503 POST https://translation.googleapis.com/language/translate/v2: The service is unavailable at this time.
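Since both methods fail from time to time, switching between them by hand could be replaced by an automatic fallback. The sketch below is one possible way to do that and is not part of the original script: translate_with_fallback and its parameters are hypothetical names, and each backend is assumed to be a function taking (text, src, dest) and returning the translated string.

```python
import time


def translate_with_fallback(text, src, dest, backends,
                            retries=3, base_delay=1.0, sleep=time.sleep):
    """Try each backend in order, retrying each one with exponential backoff."""
    last_error = None
    for backend in backends:
        for attempt in range(retries):
            try:
                return backend(text, src, dest)
            except Exception as exc:  # e.g. ServiceUnavailable (503) from the Google API
                last_error = exc
                sleep(base_delay * 2 ** attempt)  # wait 1 s, 2 s, 4 s, ...
    raise RuntimeError(f"All translation backends failed for {text!r}") from last_error
```

In the script, the backends list could hold a small wrapper around translator.translate(...).text and the gTransWithSource function, in whichever order is preferred.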
# -*- coding: utf-8 -*-
"""
Created on Mon Feb 24 09:57:46 2020
@author: mw10
Translate official titles split into single words
This translates the rows in the table titleElementsOfficial_N_S_E_tmp
You choose between googletrans and calling the Google Translate API directly.
googletrans crashes often, and Google Translate limits the number of requests you get. Switch method whenever one of them crashes.
"""
import pandas as pd
import numpy as np
import pyodbc
"""
INPUT HERE:
1 = googletrans, 2 = Google API
Set the number of records to run
"""
groupsToTranslate = 1000 # number of records in language groups to translate
translationMethod = 2 # 1 = googletrans, 2 = Google API
if translationMethod == 1:
    from googletrans import Translator
    translator = Translator()
else:
    import importlib
    # custom functions as backup for googletrans
    gt = importlib.import_module('Google_Translate_ v2_functions')
    # import_module only returns the module object, so bind the functions used below
    IsGoogleAPICredentials = gt.IsGoogleAPICredentials
    SetGoogleAPIcredentials = gt.SetGoogleAPIcredentials
    gTransWithSource = gt.gTransWithSource
    # check the credentials before using the Google API. You may lose credentials at any moment
    if not IsGoogleAPICredentials():
        # set the credentials. You may have to do this every so often. Note the raw-string path format r'...'
        SetGoogleAPIcredentials(r'W:\CEMM\Python\Balthazzar-62e6f82154a7.json')  # get credentials from Google Cloud
langList = ["en", "no", "sv"]  # from dbo.countriesLU
nLang = len(langList) + (len(langList) - 1)  # fetch more rows than there are languages, in case something has upset the group system
sourceLetter = "O" # O = Original
groupsToTranslate = range(groupsToTranslate)
conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=MW-SXD0E-008;'
                      'Database=Balthazzar;'
                      'Trusted_Connection=yes;', autocommit=True)
cursor = conn.cursor()  # get all data to translate
sqlWIP ='SELECT TOP (?) elementID, elementGroupID, language, element, source, groupStatus, wordcnt, verified, groupTranslated FROM dbo.titleElementsOfficial_N_S_E_tmp WHERE (groupTranslated <> 1) ORDER BY elementGroupID'
df = pd.DataFrame(columns = ["elementID", "elementGroupID", "language", "element", "source", "groupStatus", "wordcnt", "verified", "groupTranslated"])
for i in groupsToTranslate:
    df = df[0:0]  # empty dataframe before next round
    # select untranslated records: one per language plus slack in case something has bombed earlier
    cursor.execute(sqlWIP, nLang)
    recordGroup = cursor.fetchall()
    if not recordGroup:  # nothing left to translate
        print("No untranslated groups left")
        break
    groupID = recordGroup[0][1]
    original = None  # language of the source record in this group
    #print(i,"groupID",groupID,"\n\n")
    for record in recordGroup:  # loop through query result
        if record[1] == groupID:  # this groupID only
            #print(i,"len(df)",len(df),"\n")
            df.loc[len(df)] = list(record)
            if record[4] == sourceLetter:  # source language detected
                original = record[2]
                element = record[3]
    if original is None:  # do not silently reuse the source from a previous group
        raise RuntimeError(f"No source record ('{sourceLetter}') in group {groupID}")
    for ind in df.index:
        targetLang = df.loc[ind, "language"]
        if targetLang != original:
            if translationMethod == 1:
                translated = translator.translate(element, src=original, dest=targetLang).text
            elif translationMethod == 2:
                translated = gTransWithSource(element, original, targetLang)
            translated = translated.capitalize()
            wordCount = len(translated.split(" "))
            # a single UPDATE, so a row is never left half-updated if the script crashes
            cursor.execute("UPDATE Balthazzar.dbo.titleElementsOfficial_N_S_E_tmp "
                           "SET element = ?, source = 'G', wordcnt = ?, groupTranslated = 1 "
                           "WHERE elementGroupID = ? AND language = ?",
                           (translated, wordCount, groupID, targetLang))
            print(ind, element, original, "=", translated, targetLang)
    # set status for original at the end in case translation crashes
    cursor.execute("UPDATE Balthazzar.dbo.titleElementsOfficial_N_S_E_tmp SET groupTranslated = 1 "
                   "WHERE elementGroupID = ? AND language = ?", (groupID, original))
    print("   ", i, " ---------------- Finished translating:", element, original)
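The group handling in the loop above (take the first group in the result, find its source row, translate into every other language in the group) can be checked without a database or a Google account if it is pulled out into a pure function. The following is a sketch under that assumption; plan_group_translations is a hypothetical name, the rows are simplified to (elementGroupID, language, element, source) tuples, and the sample data is invented.

```python
def plan_group_translations(records, source_letter="O"):
    """Return (group_id, source_language, jobs) for the first group in records.

    records: (elementGroupID, language, element, source) tuples ordered by group.
    jobs: list of (element, source_language, target_language) still to translate.
    """
    if not records:
        return None, None, []
    group_id = records[0][0]
    group = [r for r in records if r[0] == group_id]  # this group only
    source_rows = [r for r in group if r[3] == source_letter]
    if not source_rows:
        raise ValueError(f"No source row ({source_letter!r}) in group {group_id}")
    _, src_lang, element, _ = source_rows[-1]  # like the loop, the last match wins
    jobs = [(element, src_lang, lang) for _, lang, _, _ in group if lang != src_lang]
    return group_id, src_lang, jobs


# invented sample rows: group 7 has an English original, group 8 is the slack
rows = [(7, "en", "Nurse", "O"), (7, "no", "", None), (7, "sv", "", None),
        (8, "en", "Baker", "O")]
print(plan_group_translations(rows))
```

Each job can then be handed to whichever translation method is active, which keeps the database updates separate from the decision about what to translate.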