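So, I've been wanting to do some NLP projects lately. With a Chinese one finished, I wanted to try an English one, but whatever my crawlers pull from sites inside China is of course all Chinese, and I've never tried crawling sites outside China, so I wasn't confident about that route. I decided to use translation instead. Along the way ChatGPT gave me a lot of methods, but every one of them failed for one reason or another. PS: GPT, you big liar!
Below I list some of the examples that failed for me. Feel free to try them yourself; you might discover something new: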
youdaoapi sample:🐟
from youdaoapi import YoudaoApi

def translate_to_english(chinese_text):
    youdao = YoudaoApi()
    english_translation = youdao.translate(chinese_text, to_lang='en')
    return english_translation

chinese_string = "你好,世界!"
english_translation = translate_to_english(chinese_string)
print("English Translation:", english_translation)
translate sample:🐟
from translate import Translator

def translate_to_english(chinese_text):
    translator = Translator(to_lang="en")
    english_translation = translator.translate(chinese_text)
    return english_translation

chinese_string = "你好,世界!"
english_translation = translate_to_english(chinese_string)
print("English Translation:", english_translation)
googletrans sample:🐟
from googletrans import Translator

def translate_to_english(chinese_text):
    translator = Translator()
    translated = translator.translate(chinese_text, src='zh-cn', dest='en')
    return translated.text

chinese_string = "你好,世界!"
english_translation = translate_to_english(chinese_string)
print("English Translation:", english_translation)
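If you want to retry the samples above on your own machine, a small harness makes it easier to see which ones work and why the others fail. This is only a sketch: `try_backends`, `working_backend` and `broken_backend` are names I made up for illustration, and the two stand-in functions exist only so the snippet runs offline. Swap in the `translate_to_english` functions from the samples you actually have installed.

```python
def try_backends(text, backends):
    """Run every translator on `text` and record its output or its error."""
    report = {}
    for name, translate_fn in backends.items():
        try:
            report[name] = ("ok", translate_fn(text))
        except Exception as exc:  # a failing backend should not stop the others
            report[name] = ("failed", "{}: {}".format(type(exc).__name__, exc))
    return report


# Hypothetical stand-ins (not real translation libraries) so this runs offline.
def working_backend(text):
    return "Hello, world!"


def broken_backend(text):
    raise ConnectionError("request blocked")


if __name__ == "__main__":
    backends = {"working": working_backend, "broken": broken_backend}
    for name, (status, detail) in try_backends("你好,世界!", backends).items():
        print(name, status, detail)
```

Each entry in the report is a `(status, detail)` pair, so a failed backend leaves its exception message behind instead of just crashing the script.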
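The Baidu API, the GOAT! Very nice! 👑
Step 1: Apply 🎈
Apply for the Baidu text translation API! The whole process is on this page: https://console.bce.baidu.com/ai/?_=1652768945367&fromai=1#/ai/machinetranslation/overview/index
Step 2: Once the application is done, go straight to my code. All you need is the API Key and Secret Key. 🎈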
# -*- coding: utf-8 -*-

# This code shows an example of text translation from Chinese to English.
# This code runs on Python 2.7.x and Python 3.x.
# You may install `requests` to run this code: pip install requests
# Please refer to `https://api.fanyi.baidu.com/doc/21` for complete api document

import requests
import random
import json


def get_access_token():
    """
    Generate the auth credential (Access Token) from the AK and SK.
    client_id: API Key
    client_secret: Secret Key
    :return: access_token as a string (the string "None" if the response contains no token)
    """
    url = "https://aip.baidubce.com/oauth/2.0/token"
    # Fill in the API Key and Secret Key from your own Baidu console.
    params = {"grant_type": "client_credentials", "client_id": 'YOUR_API_KEY', "client_secret": 'YOUR_SECRET_KEY'}
    return str(requests.post(url, params=params).json().get("access_token"))


def baidu_translate(q):
    token = get_access_token()
    url = 'https://aip.baidubce.com/rpc/2.0/mt/texttrans/v1?access_token=' + token

    # For list of language codes, please refer to `https://ai.baidu.com/ai-doc/MT/4kqryjku9#语种列表`
    from_lang = 'zh'  # example: en
    to_lang = 'en'  # example: zh
    term_ids = ''  # terminology library ids, comma-separated

    # Build request
    headers = {'Content-Type': 'application/json'}
    payload = {'q': q, 'from': from_lang, 'to': to_lang, 'termIds': term_ids}

    # Send request
    r = requests.post(url, params=payload, headers=headers)
    result = r.json()

    # Show response
    # print(json.dumps(result, indent=4, ensure_ascii=False))
    return result['result']['trans_result'][0]['dst']
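One thing to watch: the last line of `baidu_translate` indexes straight into the response, so any failed request surfaces as a bare `KeyError`. Below is a sketch of a more forgiving parser. `extract_translation` is a hypothetical helper name, and I am assuming that failed responses carry `error_code` and `error_msg` fields, so check the actual response in your own runs before relying on it.

```python
def extract_translation(result):
    """Return the translated text from a parsed JSON response, or raise ValueError."""
    # Assumption: failed responses contain "error_code" / "error_msg".
    if "error_code" in result:
        raise ValueError("translation failed: {} {}".format(
            result.get("error_code"), result.get("error_msg")))
    try:
        # Same path the original code reads.
        return result["result"]["trans_result"][0]["dst"]
    except (KeyError, IndexError, TypeError):
        raise ValueError("unexpected response shape: {!r}".format(result))
```

To use it, the final line of `baidu_translate` would become `return extract_translation(result)`.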
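Result screenshot:
Yay! Enjoy. Call this function while you crawl your data to convert the text to English, then store it as CSV or similar, and this little problem is solved. Worth writing down! --<-<-<@🌹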
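For the "translate, then store as CSV" step, here is a minimal sketch of what I mean. `save_translated_csv` and the column names `original` / `english` are my own choices for illustration, and it assumes Python 3. The translator is passed in as a function, so you can test with a dummy before spending any API quota.

```python
import csv


def save_translated_csv(rows, translate_fn, path):
    """Write one (original, translation) pair per input text to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["original", "english"])  # header row
        for text in rows:
            writer.writerow([text, translate_fn(text)])
```

With the Baidu function from above it would be called like `save_translated_csv(scraped_texts, baidu_translate, "data_en.csv")`, where `scraped_texts` is whatever list of strings your crawler produced.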