Palladian Configuration for Geo Nodes

Member for

1 year 1 month

11atzitzi

Hi there,

I have a column with the "city name" and a column with the "postal code" (both are strings). I would like to find the longitude and latitude of these locations.

Under Preferences > Palladian Geocoders, I entered the MapQuest API key and the Mapzen API key that I received after signing up.

I also configured GeoNames under Preferences > Palladian Location extractor.

1) When I use the "Location extractor" node with the "city name" column as input, I get results, but they are not very accurate.

When I run it a second time, I get this error: "ERROR LocationExtractor    0:1144     Execute failed: ws.palladian.retrieval.parser.ParserException: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 50; White spaces are required between publicId and systemId."

Why do I get this error when the node already ran successfully once?

2) I use the "MapzenGeocoder" node with the "city name" column as input and I get results. When I change the input column to "postal code" (e.g. 69346), I get this error: "ERROR MapzenGeocoder       0:1143     Execute failed: ws.palladian.extraction.location.geocoder.GeocoderException: Received HTTP status code 407"

Why is this?

3) The "MapQuestGeocoder" node does not work with any of my columns. Specifically, I get this error: "ERROR MapQuestGeocoder     0:1128     Execute failed: ws.palladian.extraction.location.geocoder.GeocoderException: Received HTTP status code 407"

Could you please advise on how to debug this? I followed the steps for acquiring the keys exactly.

Best,

A

Comments
Tue, 09/26/2017 - 02:28

Member for

7 years 2 months

qqilihq

1) When I use the "Location extractor" node with the "city name" column as input, I get results, but they are not very accurate.

When I run it a second time, I get this error: "ERROR LocationExtractor    0:1144     Execute failed: ws.palladian.retrieval.parser.ParserException: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 50; White spaces are required between publicId and systemId."

Why do I get this error when the node already ran successfully once?

(a) What does "not so accurate" mean? Please give me some examples so that I can understand your data. The "Location extractor" is intended for extracting place names (without street-level accuracy) from longer texts (similar to NER). It's probably not the best choice for resolving mentions of streets.
(b) Concerning the error, I would assume you've hit the GeoNames API limit. If you enable DEBUG logging in KNIME's preferences, there should be additional log output that helps to narrow down the issue.
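
The wording of the SAXParseException is what an XML parser typically reports when the document it receives starts with a DOCTYPE that has a public identifier but no system identifier, as many HTML error pages do. Whether GeoNames (or something in between) actually returned such a page here is an assumption; the DEBUG log would confirm or refute it. A minimal Python sketch of the effect, using made-up sample bodies rather than real GeoNames responses:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample bodies; neither is taken from a real GeoNames response.
XML_BODY = "<geonames><totalResultsCount>0</totalResultsCount></geonames>"
HTML_ERROR_PAGE = (
    '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">\n'
    "<html><head><title>Error</title></head><body>Error</body></html>"
)


def is_well_formed_xml(body: str) -> bool:
    """Return True if the body parses as XML, False otherwise."""
    try:
        ET.fromstring(body)
    except ET.ParseError:
        # An HTML page like the sample above is rejected by a strict XML parser.
        return False
    return True
```

Checking a response body this way before handing it to an XML parser makes it easier to tell an error page apart from a real result.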

 

2) I use the "MapzenGeocoder" node with the "city name" column as input and I get results. When I change the input column to "postal code" (e.g. 69346), I get this error: "ERROR MapzenGeocoder       0:1143     Execute failed: ws.palladian.extraction.location.geocoder.GeocoderException: Received HTTP status code 407"

Why is this?

I'm not familiar with the Mapzen internals, but presumably a bare zip code is not enough.

On the other hand, 407 (Proxy Authentication Required) indicates a proxy error. So you should first check your company's proxy settings and whether KNIME has access to the internet.
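
For reference, the meaning of a status code can be looked up outside of KNIME. The following is a minimal Python sketch, not part of Palladian or KNIME; `describe_status` is a helper name chosen for illustration. It returns the standard reason phrase for an HTTP status code, and the standard library's `getproxies()` can be used alongside it to see which proxy settings the system exposes.

```python
from http import HTTPStatus
from urllib.request import getproxies


def describe_status(code: int) -> str:
    """Return the standard reason phrase for an HTTP status code."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        # The code is not one of the registered HTTP status codes.
        return "Unknown status code"


def system_proxies() -> dict:
    """Return the proxy settings found in the environment or OS configuration."""
    return getproxies()
```

Calling `describe_status(407)` gives "Proxy Authentication Required", which points at the proxy rather than at the geocoding service or the API key.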

 

 

Tue, 09/26/2017 - 02:29

Member for

7 years 2 months

qqilihq

For (2) and (3), you can also enable DEBUG logging as described above to see more details:

KNIME -> Preferences -> KNIME -> KNIME GUI -> DEBUG

Thu, 10/05/2017 - 05:51

Member for

1 year 1 month

11atzitzi

Hi qqilihq,


1) a) The "city name" column is a string column containing names of cities in Germany (and only Germany), e.g. Frankfurt, Hamburg, etc. When I say that the "Location extractor" is not so accurate, I mean that it places some of the cities in other countries, such as Switzerland or the Netherlands. I think the reason is that cities with the same name exist in those countries.

So, is there any way to get good enough results for the location (latitude, longitude) using only "city name" or "postal code", or do I need the exact address? I also have an "address" column, but it doesn't work because the data quality is poor. For example, the German word for "street" is "strasse", and it is sometimes written as "str.", so I assume the algorithm cannot recognize "str." as "strasse" and therefore produces no results.
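
One possible workaround for both issues is to clean the strings before geocoding. The following is a minimal Python sketch, not a Palladian or KNIME feature: it expands the abbreviation "str." to "strasse" and appends the country to the query, which may help a geocoder pick the German city. The function names and the query format are assumptions for illustration; whether a given geocoder accepts such a query has to be tested.

```python
import re

# Matches the abbreviation "str." in any capitalisation, e.g. in "Hauptstr." or "Berliner Str."
_STR_ABBREVIATION = re.compile(r"str\.", re.IGNORECASE)


def expand_street_abbreviation(address: str) -> str:
    """Replace the abbreviation "str." with "strasse", keeping the initial capitalisation."""
    def _expand(match: re.Match) -> str:
        return "Strasse" if match.group(0)[0].isupper() else "strasse"
    return _STR_ABBREVIATION.sub(_expand, address)


def build_query(city: str, postal_code: str = "", country: str = "Germany") -> str:
    """Combine postal code, city and country into a single query string."""
    parts = [f"{postal_code} {city}".strip(), country]
    return ", ".join(part for part in parts if part)
```

For example, `build_query("Frankfurt", "60311")` produces a query containing the postal code, the city and the country, instead of the bare city name.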