In this blog series we’ll be experimenting with the most interesting blends of data and tools. Whether it’s mixing traditional sources with modern data lakes, open-source devops on the cloud with protected internal legacy tools, SQL with noSQL, web-wisdom-of-the-crowd with in-house handwritten notes, or IoT sensor data with idle chatting, we’re curious to find out: will they blend? Want to find out what happens when IBM Watson meets Google News, Hadoop Hive meets Excel, R meets Python, or MS Word meets MongoDB?
Follow us here and send us your ideas for the next data blending challenge you’d like to see at firstname.lastname@example.org.
Today: Chinese meets English meets Thai meets German meets Italian meets Arabic meets Farsi meets Russian. Around the world in eight languages
Authors: Anna Martin, Hayley Smith, and Mallika Bose
No doubt you are familiar with the adventure novel “Around the World in 80 Days” in which British gentleman Phileas Fogg makes a bet that he can circumnavigate the world in 80 days. Today we will be attempting a similar journey. However, ours is unlikely to be quite as adventurous as the one Phileas made. We won’t be riding Elephants across the Indian mainland, nor rescuing our travel companion from the circus. And we certainly won’t be getting attacked by Native American Sioux warriors!
Our adventure will begin from our offices on the Lake of Constance in Germany. From there we will travel down to Italy, stopping briefly to see the Coliseum. Then across the Mediterranean to see the Pyramids of Egypt and on through the Middle East to the ancient city of Persepolis. After a detour via Russia to see the Red Square in Moscow, our next stop will be the serene beaches of Thailand for a short break before we head off to walk the Great Wall of China (or at least part of it). On the way home, we will stop in and say hello to our colleagues in the Texas office.
Like all good travelers, we want to stay up-to-date with the news the entire time. Our goal is to read the local newspapers … in the local language of course! This means reading news in German, Italian, Arabic, Farsi, Chinese, Russian, Thai, and lastly, English. Impossible you say? Well, we’ll see.
The real question is: will all those languages blend?
Topic. Blending news in different languages
Challenge. Will the Text Processing nodes support all the different encodings?
Access Mode. Text Processing nodes and RSS Feed Reader node