
Python Google translator - pytranslator

While working on internationalization (i18n), I had to translate a whole bunch of text strings from English to German! I didn't want to open http://google.com/translate in the browser for every string and do the routine copy and paste. So I decided to write a Python program that translates text between the languages Google Translate supports. It's a Python module that you can import and use as needed, for example to translate an entire GNU portable object (.po) file from one language to another.
I call it "pytranslator".
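
To make the .po use case concrete, here is a minimal sketch (written for Python 3) of how untranslated entries could be filled in. It is an illustration under stated assumptions, not part of pytranslator: the names `translate_po_lines` and `escape_po` are hypothetical, only single-line `msgid`/`msgstr` pairs are handled (no plural forms, no multi-line strings), and the translation function is passed in as a parameter so that any callable taking a string and returning a string or None can be plugged in.

```python
import re

# Matches a single-line entry such as: msgid "Hello"
MSGID_RE = re.compile(r'^msgid "(.*)"$')


def escape_po(text):
    """Escape backslashes and double quotes for use inside a .po string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def translate_po_lines(lines, translate_fn):
    """Return a copy of lines with empty single-line msgstr entries filled in.

    translate_fn is any callable mapping source text to translated text
    (or None on failure); this signature is an assumption of the sketch.
    """
    lines = [line.rstrip("\n") for line in lines]
    out = []
    pending = None  # msgid text still waiting for its msgstr line
    for i, line in enumerate(lines):
        m = MSGID_RE.match(line)
        if m:
            pending = m.group(1)
        elif line == 'msgstr ""' and pending:
            nxt = lines[i + 1] if i + 1 < len(lines) else ""
            # A following quoted line means a multi-line msgstr: leave it alone.
            if not nxt.startswith('"'):
                translated = translate_fn(pending)
                if translated:
                    line = 'msgstr "%s"' % escape_po(translated)
            pending = None
        elif line.startswith('"'):
            # Continuation line: the entry is multi-line, so skip it.
            pending = None
        out.append(line)
    return out
```

With the module from this post, `translate_fn` could be something like `lambda s: translate(s, "English", "German")`. Entries that already have a translation, and the header entry with its empty `msgid`, pass through unchanged.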

# Python 2 module: relies on urllib.FancyURLopener, urllib.urlencode and print statements.
import re
import urllib

# Maps lower-case language names to Google Translate language codes.
langCode = {
    "arabic": "ar", "bulgarian": "bg", "chinese": "zh-CN", "croatian": "hr", "czech": "cs",
    "danish": "da", "dutch": "nl", "english": "en", "finnish": "fi", "french": "fr",
    "german": "de", "greek": "el", "hindi": "hi", "italian": "it", "japanese": "ja",
    "korean": "ko", "norwegian": "no", "polish": "pl", "portuguese": "pt", "romanian": "ro",
    "russian": "ru", "spanish": "es", "swedish": "sv",
}

def setUserAgent(userAgent):
    # Set the User-Agent string sent by every subsequent urllib.urlopen() call.
    urllib.FancyURLopener.version = userAgent

def translate(text, fromLang="English", toLang="German"):
    # Present a browser-like User-Agent for the request.
    setUserAgent("Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008070400 SUSE/3.0.1-0.1 Firefox/3.0.1")
    try:
        # Build the form fields; an unknown language name raises KeyError.
        post_params = urllib.urlencode({"langpair": "%s|%s" % (langCode[fromLang.lower()], langCode[toLang.lower()]), "text": text, "ie": "UTF8", "oe": "UTF8"})
    except KeyError, error:
        print "Currently we do not support %s" % (error.args[0])
        return None
    # Passing data as the second argument makes urlopen send a POST request.
    page = urllib.urlopen("http://translate.google.com/translate_t", post_params)
    content = page.read()
    page.close()
    # Scrape the translated text out of the result page.
    match = re.search("<div id=result_box dir=\"ltr\">(.*?)</div>", content)
    if match is None:
        # Result div not found, for example because the page layout changed.
        return None
    return match.group(1)
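
The listing above targets Python 2 (`urllib.urlencode`, `print` statements). For readers on Python 3, here is a minimal sketch of the two steps that need no network access: building the POST body and pulling the result out of the returned HTML. The names `LANG_CODE`, `build_post_params` and `extract_result` are made up for this illustration, the table is trimmed to three languages, and the `result_box` markup is assumed to be exactly what the regular expression in `translate` expects; the live page may well differ.

```python
import re
from urllib.parse import urlencode

# Trimmed copy of the langCode table, enough for the demonstration.
LANG_CODE = {"english": "en", "german": "de", "french": "fr"}

# Same pattern as in translate(); DOTALL lets a result span several lines.
RESULT_RE = re.compile(r'<div id=result_box dir="ltr">(.*?)</div>', re.DOTALL)


def build_post_params(text, from_lang="English", to_lang="German"):
    """Return the URL-encoded POST body, or None for an unsupported language."""
    try:
        pair = "%s|%s" % (LANG_CODE[from_lang.lower()], LANG_CODE[to_lang.lower()])
    except KeyError:
        return None
    return urlencode({"langpair": pair, "text": text, "ie": "UTF8", "oe": "UTF8"})


def extract_result(html):
    """Return the text inside the result_box div, or None if it is absent."""
    match = RESULT_RE.search(html)
    return match.group(1) if match else None
```

Keeping these two steps separate from the HTTP call means the language lookup and the scraping logic can be checked against canned input. The request itself would go through `urllib.request.urlopen` with the encoded body converted to bytes.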

- Thejaswi Raya

About

This page contains a single entry from the blog posted on February 4, 2009 3:07 PM.
