
SDF Writer US-ASCII charset limit

swebb


We've been having some issues writing references out to an SDF when the reference contains an accented character.

Looking into the SDF Writer node, the DefaultSDFWriter#openOutputWriter method specifies US-ASCII as the Charset. I can't see anything in the specification for SDF that restricts which characters the data may contain, other than a maximum length in characters.

Making the following change:


        // Use ISO-8859-1 instead of US-ASCII so that accented Latin-1 characters are written as-is.
        if (m_settings.fileName().endsWith(".gz")) {
            return new BufferedWriter(new OutputStreamWriter(new GZIPOutputStream(os), StandardCharsets.ISO_8859_1));
        } else {
            return new BufferedWriter(new OutputStreamWriter(os, StandardCharsets.ISO_8859_1));
        }


This appears to enable us to write out the references. Have I overlooked something, or would it be safe to make this change?
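
For anyone wanting to reproduce the effect outside KNIME, here is a minimal sketch that pushes a string through the same BufferedWriter/OutputStreamWriter stack into memory. The class and method names (WriterCharsetDemo, write, canEncode) are made up for illustration and are not part of the SDF Writer code. OutputStreamWriter substitutes the charset's replacement byte for characters it cannot map, which for US-ASCII is '?'.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of KNIME.
final class WriterCharsetDemo {

    /** Encodes text through a BufferedWriter/OutputStreamWriter pair into a byte array. */
    static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        try (Writer w = new BufferedWriter(new OutputStreamWriter(os, charset))) {
            w.write(text);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return os.toByteArray();
    }

    /** Returns true if the charset can represent every character of text. */
    static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String accented = "\u00e9"; // e with acute accent
        String greek = "\u03b1";    // Greek small letter alpha
        Charset[] candidates = {
            StandardCharsets.US_ASCII, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8
        };
        for (Charset cs : candidates) {
            System.out.printf("%-10s accented: %b, greek: %b, bytes for accented: %d%n",
                    cs, canEncode(accented, cs), canEncode(greek, cs), write(accented, cs).length);
        }
    }
}
```

Note that ISO-8859-1 covers accented Latin characters but not Greek letters, so an alpha would still be replaced by '?' with the change above; UTF-8 can represent both.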



Tue, 05/30/2017 - 12:57



One thing I overlooked: can the SDF Reader read it back in? Answer: nope.
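
A rough sketch of why a round trip can fail, assuming the reader decodes the file as US-ASCII (I have not confirmed that this is what the SDF Reader does; it is only a guess consistent with the symptom). The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of KNIME.
final class RoundTripDemo {

    /** Encodes text with one charset and decodes the bytes with another. */
    static String roundTrip(String text, Charset writeAs, Charset readAs) {
        byte[] bytes = text.getBytes(writeAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        String accented = "\u00e9"; // e with acute accent

        // Written as ISO-8859-1, read as US-ASCII: byte 0xE9 is not valid ASCII,
        // so the decoder substitutes U+FFFD (the replacement character).
        String asciiRead = roundTrip(accented, StandardCharsets.ISO_8859_1, StandardCharsets.US_ASCII);
        System.out.println("read as US-ASCII is U+FFFD: " + asciiRead.equals("\uFFFD"));

        // Same charset on both sides: the character survives.
        String latin1Read = roundTrip(accented, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);
        System.out.println("read as ISO-8859-1 survives: " + latin1Read.equals(accented));
    }
}
```

So changing only the writer is not enough; the reader would need to use the same charset.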

Tue, 05/30/2017 - 06:01



Hrm, maybe CT Files are expected to be ASCII?

Tue, 05/30/2017 - 10:04



I also didn't find a reference to what charset is acceptable in SDF files. Therefore we decided to stick to ASCII (which probably was the only one around when SDF was invented...).

Wed, 12/06/2017 - 05:07



This has been a serious problem for me, as SDFs I encounter in the wild can be ASCII, cp-1252, or UTF-8.

Is there any chance of this functionality being added to the SDF Reader/Writer? I use KNIME to run various filters to weed out data quality problems, and it's really not great if all the alphas, betas, primes, etc. end up getting scrambled in the process.
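
As a stopgap outside the nodes, something like the following could guess which of the three encodings a file uses before it is processed. This is only a heuristic sketch with made-up names (SdfCharsetGuess, guess): it treats pure 7-bit data as ASCII, data that decodes strictly as UTF-8 as UTF-8, and everything else as windows-1252. A cp-1252 file whose high bytes happen to form valid UTF-8 sequences would be misclassified.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of KNIME.
final class SdfCharsetGuess {

    static final Charset CP1252 = Charset.forName("windows-1252");

    /** Returns true if every byte is in the 7-bit ASCII range. */
    static boolean isAscii(byte[] data) {
        for (byte b : data) {
            if (b < 0) { // bytes 0x80..0xFF are negative in Java
                return false;
            }
        }
        return true;
    }

    /** Heuristically picks US-ASCII, UTF-8, or windows-1252 for the given file content. */
    static Charset guess(byte[] data) {
        if (isAscii(data)) {
            return StandardCharsets.US_ASCII;
        }
        try {
            // Strict decoding: report malformed input instead of silently replacing it.
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return StandardCharsets.UTF_8;
        } catch (CharacterCodingException e) {
            // Not valid UTF-8; fall back to the single-byte Windows code page.
            return CP1252;
        }
    }

    public static void main(String[] args) {
        System.out.println(guess("plain".getBytes(StandardCharsets.US_ASCII)));
        System.out.println(guess("\u03b1-pinene".getBytes(StandardCharsets.UTF_8)));
        System.out.println(guess("\u00e9".getBytes(CP1252)));
    }
}
```

The guessed charset could then be used to re-encode the file to a single known encoding before and after the KNIME step.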