joewiz
6/9/2016 - 10:45 PM

XQuery Update data corruption problem http://markmail.org/message/3fzcixmxeh76z6l3

<result>
   <original>
      <ref xmlns="http://www.tei-c.org/ns/1.0">
         <hi rend="italic">Foreign Relations,</hi> 1952–1954, vol. VII, Part 2</ref>, p. 1914.</original>
   <analysis>
      <fn:analyze-string-result xmlns:fn="http://www.w3.org/2005/xpath-functions">
         <fn:match>
            <fn:group nr="1">, p. </fn:group>
            <fn:group nr="2">1914</fn:group>
            <fn:group nr="3">.</fn:group>
         </fn:match>
      </fn:analyze-string-result>
   </analysis>
   <new>
      <ref xmlns="http://www.tei-c.org/ns/1.0">
         <hi rend="italic">Foreign Relations,</hi> 1952–1954, vol. VII, Part 2, p. 1914</ref>.</new>
</result>
<?xml version="1.0" encoding="UTF-8"?>
<note xmlns="http://www.tei-c.org/ns/1.0">For text of NSC 164/1, see <ref>
    <hi rend="italic">Foreign Relations,</hi> 1952–1954, vol. VII, Part 2</ref>, p. 1914.</note>
xquery version "3.0";

(:
Goal: Take a TEI document containing <ref> elements that need to be fixed, and fix these with XQuery Update. 
Specifically, we find the page number references from the text node immediately following the <ref> element, 
and move the page number inside the <ref> element. (I've simplified my data and the query to illustrate.)

Problem: The XQuery Update statement corrupts the 02-sample.xml file: the resulting file has 0 bytes. When I 
comment out the XQuery Update statement and uncomment the $test variable in the return expression, I get 
the expected results, so I think the logic is sound. Also, when I comment out the `$node/@*,` line in 
local:reconstruct(), the corruption doesn't occur. But I need that line, which copies the attributes. I'm stumped.

Test environment: Saxon-EE XQuery 9.6.0.7 with oXygen 17.1; with XQuery 3.0 and XQuery Update enabled.
:)

declare namespace tei="http://www.tei-c.org/ns/1.0";

declare function local:reconstruct($nodes as node()*) {
    for $node in $nodes
    return
        typeswitch ($node) 
            case element() return
                element 
                    { node-name($node) } 
                    { 
                        $node/@*,
                        local:reconstruct($node/node()) 
                    }
            default return $node
};

let $doc := doc('02-sample.xml')
let $refs := $doc//tei:ref
    [matches(following-sibling::node()[1][. instance of text()], '^, pp?\.\s+\d+')]
for $ref in $refs
let $following-text := $ref/following-sibling::text()[1]
let $analyze := analyze-string($following-text, '^(, pp?\.\s+)(\d+)(.*)$')
let $new-ref := 
    (
    element 
        { QName('http://www.tei-c.org/ns/1.0', 'ref') }
        { 
            local:reconstruct($ref/node()),
            string-join($analyze/fn:match/fn:group[@nr = (1, 2)])
        }
    )
let $new-following-text := string-join($analyze/fn:match/fn:group[@nr ge 3])
let $test := 
    <result>
        <original>{$ref, $following-text}</original>
        <analysis>{$analyze}</analysis>
        <new>{$new-ref, $new-following-text}</new>
    </result>
return 
    
    (:
    $test
    :)
    
    (
    replace node $ref with $new-ref
    ,
    replace node $following-text with $new-following-text
    )
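
One possible way to sidestep the problem is to avoid rebuilding the <ref> element at all, so that no attributes have to be copied. The sketch below is an assumption on my part, not a confirmed fix: it has not been run against Saxon-EE 9.6.0.7, and whether it avoids the 0-byte output is untested. It appends the page reference to the existing <ref> with `insert node ... as last into` and trims the following text node with `replace value of node`, both of which are standard XQuery Update Facility expressions. The variable names $moved and $rest and the 's' regex flag are additions that do not appear in the original query.

```xquery
xquery version "3.0";

declare namespace tei = "http://www.tei-c.org/ns/1.0";

for $ref in doc('02-sample.xml')//tei:ref
(: only consider a text node that immediately follows the <ref> :)
let $following-text := $ref/following-sibling::node()[1][self::text()]
where matches($following-text, '^, pp?\.\s+\d+')
(: the 's' flag (an addition here) lets (.*) span line breaks in the trailing text :)
let $analyze := analyze-string($following-text, '^(, pp?\.\s+)(\d+)(.*)$', 's')
(: groups 1 and 2 (", p. " and the page number) move inside the <ref> :)
let $moved := string-join($analyze/fn:match/fn:group[@nr = (1, 2)])
(: group 3 is whatever remains after the page number :)
let $rest := string-join($analyze/fn:match/fn:group[@nr = 3])
return (
    (: append the page reference to the existing <ref>, leaving its attributes and children in place :)
    insert node text { $moved } as last into $ref,
    (: shrink the following text node to the remainder :)
    replace value of node $following-text with $rest
)
```

Because the original <ref> node stays in place, its attributes and child elements are never reconstructed, so local:reconstruct() is not needed in this variant.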