RSS

Tag Archives: XML

Extract the XPath of every node from an XML. Uh, in SQL Server

Disclaimer: this is SQL Server 2005 compliant! =O

Ok, we’re on a database and we want tables:

EXEC sp_xml_preparedocument @idoc OUTPUT, @xml
SELECT 
    ISNULL(id,'') AS id
    , parentid
    , ROW_NUMBER() OVER (
	PARTITION BY parentid,localname
	ORDER BY id
    ) AS idx
    , ROW_NUMBER() OVER (
	PARTITION BY parentid
	ORDER BY id
    ) AS position
    , nodetype
    , CASE
	WHEN (nodetype = 3) THEN 'text()'
	WHEN (nodetype = 2) THEN '@' + localname
	ELSE localname
    END as localname
    , text
  INTO #nodetree
  FROM OPENXML(@idoc, '/', 3)
EXEC sp_xml_removedocument @idoc

ALTER TABLE #nodetree ADD PRIMARY KEY (id)
SELECT * FROM #nodetree

Now let’s use a recursive CTE to build the XPath of every text node

;WITH cte AS (
  SELECT
    parentid
    , id
    , N'/' + localname + '[' + cast(idx as nvarchar(100)) + ']' AS xpath
    , text
  FROM #nodetree WHERE nodetype = 3
  UNION ALL
  SELECT
    parent.parentid
    , node.id
    , N'/' + localname + '[' + cast(idx as nvarchar(100)) + ']' + xpath
    , node.text
  FROM cte AS node
  INNER JOIN #nodetree parent on parent.id = node.parentid
)
SELECT
  REPLACE(xpath, '/text()[1]', '') as xpath, text
  FROM cte
  WHERE parentid IS NULL

Finally check the XPath built:

select 
  node.value(N'insert here one xpath row from the CTE', nvarchar(1000))
  from @xml.nodes(N'.') as T(node)
Advertisements
 
Leave a comment

Posted by on 2016/12/23 in dev

 

Tags: , , ,

xsd quiz

I recently needed to solve an XSD problem. I needed to define a schema that allows this:

<Element>
  <FIRSTValue>abcde</FIRSTValue>
  <SECONDValue>12345</SECONDValue>
</Element>

where FIRSTValue is not mandatory, SECONDValue is not mandatory but either one of them or both must exists.

I started using a “sequence”

<xs:complexType name="AltType">
    <xs:sequence>
      <xs:element type="xs:string" name="FIRSTValue" minOccurs="0" />
      <xs:element type="xs:string" name="SECONDValue" minOccurs="0" />
    </xs:sequence>
</xs:complexType>

but this allows 0 children…

I then tried an “all”:

<xs:complexType name="AltType">
    <xs:all>
      <xs:element type="xs:string" name="FIRSTValue" minOccurs="0" />
      <xs:element type="xs:string" name="SECONDValue" minOccurs="0" />
    </xs:all>
</xs:complexType>

and this also allows 0 children…

Ok let’s think about a “choice”:

<xs:complexType name="AltType">
    <xs:choice>
      <xs:element type="xs:string" name="FIRSTValue" minOccurs="0" />
      <xs:element type="xs:string" name="SECONDValue" minOccurs="0" />
    </xs:choice>
</xs:complexType>

Ouch! I can’t have BOTH children …

Ok, the choice is surely the way:

<xs:complexType name="AltType">
    <xs:choice>
      <xs:element type="xs:string" name="FIRSTValue"  />
      <xs:element type="xs:string" name="SECONDValue"   />
      <xs:sequence>
        <xs:element type="xs:string" name="FIRSTValue" />
        <xs:element type="xs:string" name="SECONDValue" />
      </xs:sequence>
    </xs:choice>
</xs:complexType>

but this is evalued as a “non deterministic” choice and doesn’t compile.

Finally the solution (because yes, it IS possible). Note: to avoid an empty node I defined a new type of string with minimum length = 1.

  <xs:simpleType name="notEmptyString">
    <xs:restriction base="xs:string">
      <xs:minLength value="1" />
    </xs:restriction>
  </xs:simpleType>

  <xs:complexType name="AltType">
    <xs:choice>
      <xs:sequence>
        <xs:element type="notEmptyString" name="FIRSTValue" />
        <xs:element type="xs:string" name="SECONDValue" minOccurs="0" />
      </xs:sequence>
      <xs:element type="notEmptyString" name="SECONDValue" />
    </xs:choice>
  </xs:complexType>
 
Leave a comment

Posted by on 2016/06/30 in dev

 

Tags: , ,

The XSLT series. Season 1 – Episode 2: the GNOME way

Here I’ll try the GNOME implementation of the W3C specification. You’ll find on their site the (main) C implementation as well as many different bindings (as CPP, Perl, Ruby, …).

Windows

Under Windows you can use the Zlatovic build. Here’s all we need to make some test:

input.xml
transform.xsl
iconv.dll
zlib1.dll
libexslt.dll
libxml2.dll
libxslt.dll
xsltproc.exe

The XSLT transformation will be performed this way:

xsltproc --output output.html transform.xsl input.xml

Linux

Under Linux we’ll build the xsltproc tool ourselves. It’s pretty easy:

download libxslt-1.1.28.tar.gz
tar xvf libxslt-1.1.28.tar.gz
cd libxslt-1.1.28
./configure
make
cd xsltproc

Well done! Now we can launch the transform as always:

./xsltproc --output output.html transform.xsl input.xml

Other Info

There’s a very well done tutorial on how to use this library to write your own program on the xmlsoft.org page.

 
Leave a comment

Posted by on 2013/11/27 in dev

 

Tags: ,

The XSLT series. Season 1 – Episode 1: the Xalan way

Advisory: as the http://xml.apache.org site says “2012-04-12 – Project XML has been retired.“. It has been moved into their “Attic“.

Xalan is the name of the Apache implementation of the W3C XSL Transformation (1.0). It is built on Xerces, a processor for parsing, validating, serializing and manipulating XML.
It exists in two flavours: the C++ one and the Java one.

On http://xalan.apache.org/index.html there’s all we need. Let’s play with it!

Here’s a sample (C++ / Windows) that uses a precompiled executable that made all the work. Given the following files:

input.xml
transform.xsl
Xalan-C_1_11.dll
XalanMessages_1_11.dll
xerces-c_3_1.dll
Xalan.exe

we can made our XSLT transform easily this way:

Xalan.exe -o output.html input.xml transform.xsl

That’s all…now let’s check the output.html file to see the result.

Now let’s try the Java version:

input.xml
transform.xsl
serializer.jar
xalan.jar
xercesImpl.jar
xml-apis.jar

Here’s the command:

java -cp "xalan.jar;serializer.jar;xml-apis.jar;xercesImpl.jar" org.apache.xalan.xslt.Process -IN input.xml -XSL transform.xsl -OUT output.html

As you can see here we’re OS agnostic. I also passed the CLASSPATH as a parameter, but we all know that the correct way is to set it as an environment variable.
The result is again into the output.html file.

Finally let’s play coding this libraries:

// Exerpt from http://xml.apache.org/xalan-c/usagepatterns.html#xalantransformer

#include <xalanc/Include/PlatformDefinitions.hpp>
#include <xercesc/util/PlatformUtils.hpp>
#include <xalanc/XalanTransformer/XalanTransformer.hpp>
#include <xalanc/XSLT/XSLTResultTarget.hpp>
#include <xalanc/XSLT/XSLTInputSource.hpp>

int main(int argc, char *argv[])
{
  int theResult = -1;

  // Locally define the namespaces
  XALAN_USING_XERCES(XMLPlatformUtils)
  XALAN_USING_XALAN(XalanTransformer)	

  // Initialize Xerces and Xalan
  XMLPlatformUtils::Initialize();
  XalanTransformer::initialize();
	
  // Create a XalanTransformer
  XalanTransformer theXalanTransformer;

  // Input / output
  XALAN_USING_XALAN(XSLTInputSource)
  XALAN_USING_XALAN(XSLTResultTarget)
  XSLTInputSource xmlIn("input.xml");
  XSLTInputSource xslIn("transform.xsl");
  XSLTResultTarget xmlOut("output.html");

  // Do the trasform
  theResult = theXalanTransformer.transform(xmlIn, xslIn, xmlOut);

  // Shut down and terminate
  XalanTransformer::terminate();
  //XMLPlatformUtils::Terminate(); // mem leak?
  XalanTransformer::ICUCleanUp();
	
  return theResult;
}
 
Leave a comment

Posted by on 2013/10/24 in dev

 

Tags: , ,

The XSLT series. Season 1 – Episode 0 (pilot)

Who knows me knows that I’m not a huge XML fan, but I find XSLT pretty funny. It’s a sort of Philosopher’s Stone: you can take apples and transform them into stones =)

This is the first post of a series: I’ll take some easy XSLT recipe and I’ll work on it with some sample.
I am interested in different XSLT implementations and I’ll explore the most common out there.

Every sample will use the same input xml file (named “input.xml”):

<Root>
	<Cities continent="EU">
		<City id="2">Berlin</City>
		<City id="3">Rome</City>
	</Cities>
	<Cities continent="US">
		<City id="1">New York</City>
	</Cities>
</Root>

and the same XSLT (named “transform.xsl”):

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- :::::::::::::::: MAIN DOCUMENT :::::::::::::::: -->
<xsl:template match="/">
  <html>
  <head>
  <title>Cities</title>
  </head>
  <body>  
    <table>
    <tr>
      <xsl:apply-templates select="/Root/Cities" />
    </tr>
    </table>
  </body>
  </html>
</xsl:template>

<!-- :::::::::::::::: TABLE :::::::::::::::: -->
<xsl:template match="Cities/City">
  <!-- we're on a City node and we call its template -->
  <xsl:call-template name="citieshandler">
    <xsl:with-param name="continent"><xsl:value-of select="../@continent" /></xsl:with-param>
  </xsl:call-template>
</xsl:template>

<!-- :::::::::::::::: TABLE CELL TEMPLATE :::::::::::::::: -->
<xsl:template name="citieshandler">
  <xsl:param name="continent">unknown</xsl:param>

  <!-- write a comment (from the parameter) -->
  <xsl:comment><xsl:value-of select="$continent" /></xsl:comment> 
  <!-- the table row / cell -->
  <tr>
    <td>
      <xsl:value-of select="." />
    </td>
  </tr>
</xsl:template>

</xsl:stylesheet>

This is just the intro…soon I’ll try the implementations from GNOME (libxml2 / libxslt), Apache (xalan / xerces), Saxon, Qt native implementations (QtXml / QtXmlPatterns) and finally C# (XslCompiledTransform xslt / XmlDocument) with one eye at the Mono implementation.

See you soon!

 
Leave a comment

Posted by on 2013/10/21 in dev

 

Tags: ,

Navigate a parent-child structure without recursion

This is a tricky query to navigate the whole structure, flattening every parent-child relation at every level, whitout recursion.
It uses an XML to represent the tree, then ‘navigate’ that tree taking for each child node its parent.

This is the structured XML:

declare @x xml = '
<x>
  <node id="1"></node>
  <node id="2">
    <node id="21" />
    <node id="22" />
    <node id="23">
      <node id="231" />
    </node>
  </node>
  <node id="3">
    <node id="31" />
  </node>
</x>
'

We can easily find every level’s node and its parent:

select 
  R.value('@id', 'int') as node
  , R.value('../@id', 'int') as parent
  from @x.nodes('//node') T(R)

Pretty easy, isn’t it?

 
Leave a comment

Posted by on 2013/10/11 in dev

 

Tags: , ,

Namespacing XML into TSQL

Some useful how-to use (and NOT to use) namespaces into the XQuery language.

declare @x xml = 
'<Root xmlns="http://onesample" xmlns:ns1="http://anothersample">
  <Node Id="1">TheNode1</Node>
  <ns1:Node Id="2">TheNode2</ns1:Node>
  <Node Id="3">TheNode3</Node>
</Root>'

-- EX1: no namespace: doesn't find anything
select 
  'EX1'
  , T.R.value('.','nvarchar(20)') Node
from @x.nodes('//Node') T(R)

-- EX2: it finds the default namespaced nodes
;WITH XMLNAMESPACES (default 'http://onesample')
select 
  'EX2'
  , T.R.value('.','nvarchar(20)') Node
from @x.nodes('//Node') T(R)

-- EX3: it finds the ns1 namespaced node
;WITH XMLNAMESPACES ('http://anothersample' AS ns1)
select 
  'EX3'
  , T.R.value('.','nvarchar(20)') Node
from @x.nodes('//ns1:Node') T(R)

-- EX4: two queries for the different namespaces. 
-- Note: here we used a 'general step' to group different paths 
-- (the default and the namespaced)
;WITH XMLNAMESPACES (default 'http://onesample', 'http://anothersample' AS ns1)
select 
  'EX4'
  , T.R.value('.','nvarchar(20)')
from @x.nodes('(//ns1:Node, //Node)') T(R)

--  EX5: ignoring namespaces, find everything
select 
  'EX5'
  , T.R.value('.','nvarchar(20)') Node
from @x.nodes('//*:Node') T(R)

-- 'EX6': alternate namespace syntax
select 
  'EX6'
  , T.R.value('.','nvarchar(20)') Node
from @x.nodes('	declare default element namespace "http://onesample";
		declare namespace ns1="http://anothersample";
		(//ns1:Node, //Node)') T(R)

-- 'EX7': using the openxml statement
declare @xi int
exec sp_xml_preparedocument @xi OUTPUT, @x, '<root xmlns:ns2="http://onesample" xmlns:ns1="http://anothersample" />'
select *
  from openxml(@xi, '/ns2:MyRoot/ns2:Node', 1) 
  with	(
    Id int '@Id'
    , Txt nvarchar(100) 'text()'
  )
exec sp_xml_removedocument @xi
 
Leave a comment

Posted by on 2013/04/02 in dev

 

Tags: , ,