Examples of using the PDB Connector XML Query node

The PDB Connector (XML Query String) Node

The new PDB Connector (XML Query String) node, released in v1.1.2, allows the user to enter the XML query string directly.  This makes it possible to take a query generated on the PDB website (www.rcsb.org) and replicate it within KNIME.  The PDB Connector node has also been updated to supply the XML query as a flow variable, which can be re-used as in these examples.

Using the XML from the PDB Website - PDB Advanced Query

On the PDB Advanced Query page, complex queries can be built and run using an interactive query builder.  We show here an example of generating a query in this manner.  For demonstration purposes, the query is a free text search for the word "kinase", restricted to X-ray structures with experimental data that contain both modified residues and ligands.  The query is built using the advanced query link as shown:

 

PDB Advanced query form

Running the query shows the query results summary page (at the time of running, there were 1144 hits, but this will certainly increase with time!):

 

Query Results

Clicking on the highlighted "Query Details" link will show some summary statistics, and the query in XML format:

 

Query details

The text in the "Query in XML format" area can then be pasted into the KNIME node and the query run again.  In this case, the "Test Query" button has been clicked, and the node shows the same number of hits:

 

Node Dialog

It is worth noting that the XML generated contains a lot of extraneous information, such as when the query was run and the number of results.  This can be removed manually in the node dialog if required, leaving only the query itself:

<orgpdbcompositequery version="1.0">
 <queryrefinement>
  <queryrefinementlevel>0</queryrefinementlevel>
  <orgpdbquery>
    <version>head</version>
    <querytype>org.pdb.query.simple.AdvancedKeywordQuery</querytype>
    <keywords>kinase</keywords>
  </orgpdbquery>
 </queryrefinement>
 <queryrefinement>
  <queryrefinementlevel>1</queryrefinementlevel>
  <conjunctiontype>and</conjunctiontype>
  <orgpdbquery>
    <version>head</version>
    <querytype>org.pdb.query.simple.NoModResQuery</querytype>
    <hasmodifiedresidues>yes</hasmodifiedresidues>
  </orgpdbquery>
 </queryrefinement>
 <queryrefinement>
  <queryrefinementlevel>2</queryrefinementlevel>
  <conjunctiontype>and</conjunctiontype>
  <orgpdbquery>
    <version>head</version>
    <querytype>org.pdb.query.simple.NoLigandQuery</querytype>
    <haveligands>yes</haveligands>
  </orgpdbquery>
 </queryrefinement>
 <queryrefinement>
  <queryrefinementlevel>3</queryrefinementlevel>
  <conjunctiontype>and</conjunctiontype>
  <orgpdbquery>
    <version>head</version>
    <querytype>org.pdb.query.simple.ExpTypeQuery</querytype>
    <mvstructure.expmethod.value>X-RAY</mvstructure.expmethod.value>
    <mvstructure.hasexperimentaldata.value>Y</mvstructure.hasexperimentaldata.value>
  </orgpdbquery>
 </queryrefinement>
</orgpdbcompositequery>
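The manual clean-up described above could also be scripted outside KNIME before pasting the XML into the node.  The following is a minimal sketch only, not part of the node: the function name `strip_run_metadata` is hypothetical, the list of elements treated as run metadata is an assumption based on the listings in this article, and the input is a shortened, illustrative fragment using the element names as printed here.

```python
import xml.etree.ElementTree as ET

# Elements assumed to be run metadata rather than query parameters
# (an assumption based on the XML listings in this article).
RUN_METADATA = {
    "description",
    "queryid",
    "resultcount",
    "runtimestart",
    "runtimemilliseconds",
}


def strip_run_metadata(xml_text: str) -> str:
    """Return the query XML with the run-specific elements removed."""
    root = ET.fromstring(xml_text)
    # Copy both lists up front so that removing children
    # does not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag.lower() in RUN_METADATA:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


# Shortened, illustrative input in the style produced by "Query Details".
raw_query = """<orgpdbcompositequery version="1.0">
 <resultcount>6624</resultcount>
 <queryid>C4B718AE</queryid>
 <queryrefinement>
  <queryrefinementlevel>0</queryrefinementlevel>
  <orgpdbquery>
    <version>head</version>
    <querytype>org.pdb.query.simple.AdvancedKeywordQuery</querytype>
    <description>Text Search for: kinase</description>
    <queryid>null</queryid>
    <resultcount>6624</resultcount>
    <runtimestart>2014-03-18T10:38:47Z</runtimestart>
    <runtimemilliseconds>180</runtimemilliseconds>
    <keywords>kinase</keywords>
  </orgpdbquery>
 </queryrefinement>
</orgpdbcompositequery>"""

cleaned_query = strip_run_metadata(raw_query)
print(cleaned_query)
```

The whitespace of the printed result may differ slightly from the hand-edited version, since each removed element takes its trailing line break with it.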

A second node can use the flow variable output to generate an alternative report for the same query.  Note that the PDB Connector node has also been updated to provide the XML query as a flow variable, and that the "Test Query" button in the XML Query String node will use the flow variable value if it is available:

 

2nd Node config by flow variable

Using iterative queries on the PDB website

The same options are also available for queries built up iteratively by refining an initial query.  For example, the query above could be refined to only the structures with a resolution value of less than 1.5 Å.  Again, the Query Details link provides the XML, which can be used in the node:

<orgpdbcompositequery version="1.0">
 <resultcount>6624</resultcount>
 <queryid>C4B718AE</queryid>
 <queryrefinement>
  <queryrefinementlevel>0</queryrefinementlevel>
  <orgpdbquery>
    <version>head</version>
    <querytype>org.pdb.query.simple.AdvancedKeywordQuery</querytype>
    <description>Text Search for: kinase</description>
    <queryid>null</queryid>
    <resultcount>6624</resultcount>
    <runtimestart>2014-03-18T10:38:47Z</runtimestart>
    <runtimemilliseconds>180</runtimemilliseconds>
    <keywords>kinase</keywords>
  </orgpdbquery>
 </queryrefinement>
 <queryrefinement>
  <queryrefinementlevel>1</queryrefinementlevel>
  <conjunctiontype>and</conjunctiontype>
  <orgpdbquery>
    <version>head</version>
    <querytype>org.pdb.query.simple.NoModResQuery</querytype>
    <description>Ligand Search :  Has modified polymeric residues=yes</description>
    <queryid>null</queryid>
    <resultcount>17992</resultcount>
    <runtimestart>2014-03-18T10:38:47Z</runtimestart>
    <runtimemilliseconds>172</runtimemilliseconds>
    <hasmodifiedresidues>yes</hasmodifiedresidues>
  </orgpdbquery>
 </queryrefinement>
 <queryrefinement>
  <queryrefinementlevel>2</queryrefinementlevel>
  <conjunctiontype>and</conjunctiontype>
  <orgpdbquery>
    <version>head</version>
    <querytype>org.pdb.query.simple.NoLigandQuery</querytype>
    <description>Ligand Search : Has free ligands=yes</description>
    <queryid>null</queryid>
    <resultcount>72130</resultcount>
    <runtimestart>2014-03-18T10:38:47Z</runtimestart>
    <runtimemilliseconds>638</runtimemilliseconds>
    <haveligands>yes</haveligands>
  </orgpdbquery>
 </queryrefinement>
 <queryrefinement>
  <queryrefinementlevel>3</queryrefinementlevel>
  <conjunctiontype>and</conjunctiontype>
  <orgpdbquery>
    <version>head</version>
    <querytype>org.pdb.query.simple.ExpTypeQuery</querytype>
    <description>Experimental Method is X-RAY and has Experimental Data</description>
    <queryid>null</queryid>
    <resultcount>76671</resultcount>
    <runtimestart>2014-03-18T10:38:48Z</runtimestart>
    <runtimemilliseconds>824</runtimemilliseconds>
    <mvstructure.expmethod.value>X-RAY</mvstructure.expmethod.value>
    <mvstructure.hasexperimentaldata.value>Y</mvstructure.hasexperimentaldata.value>
  </orgpdbquery>
 </queryrefinement>
 <queryrefinement>
  <queryrefinementlevel>4</queryrefinementlevel>
  <conjunctiontype>and</conjunctiontype>
  <orgpdbquery>
    <version>head</version>
    <querytype>org.pdb.query.simple.ResolutionQuery</querytype>
    <description>Resolution is 1.499 or less</description>
    <queryid>9405E187</queryid>
    <resultcount>6644</resultcount>
    <runtimestart>2014-03-18T10:41:26Z</runtimestart>
    <runtimemilliseconds>119</runtimemilliseconds>
    <refine.ls_d_res_high.comparator>between</refine.ls_d_res_high.comparator>
    <refine.ls_d_res_high.max>1.499</refine.ls_d_res_high.max>
  </orgpdbquery>
 </queryrefinement>
</orgpdbcompositequery>
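The listing above follows a regular pattern: each refinement adds one more `queryrefinement` element with the next level number and, from level 1 onwards, a conjunction type.  As a rough illustration of that pattern, the sketch below assembles a two-level query in Python.  The helper `add_refinement` is hypothetical, the element names are copied from the listings in this article as printed, and whether the PDB service accepts XML built this way has not been verified here.

```python
import xml.etree.ElementTree as ET


def add_refinement(root, query_type, parameters, conjunction="and"):
    """Append one <queryrefinement> level to a composite query."""
    # The next level number is the count of refinements already present.
    level = len(root.findall("queryrefinement"))
    refinement = ET.SubElement(root, "queryrefinement")
    ET.SubElement(refinement, "queryrefinementlevel").text = str(level)
    # In the listings, only levels after the first carry a conjunction.
    if level > 0:
        ET.SubElement(refinement, "conjunctiontype").text = conjunction
    query = ET.SubElement(refinement, "orgpdbquery")
    ET.SubElement(query, "version").text = "head"
    ET.SubElement(query, "querytype").text = query_type
    # Query-specific parameters, e.g. <keywords> or the resolution limits.
    for name, value in parameters.items():
        ET.SubElement(query, name).text = value
    return refinement


root = ET.Element("orgpdbcompositequery", version="1.0")
add_refinement(
    root,
    "org.pdb.query.simple.AdvancedKeywordQuery",
    {"keywords": "kinase"},
)
add_refinement(
    root,
    "org.pdb.query.simple.ResolutionQuery",
    {
        "refine.ls_d_res_high.comparator": "between",
        "refine.ls_d_res_high.max": "1.499",
    },
)

composite_query = ET.tostring(root, encoding="unicode")
print(composite_query)
```

The resulting string is printed on a single line.  It could then be pasted into the node dialog or supplied through a flow variable in the same way as XML copied from the website.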