Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

The goal is to group elements starting with different heading levels into sections nested according to those levels.

The problem is similar to the one in "XSLT: moving a grouping html elements into section levels". The difference here is that the heading levels are not in strict order.

To give a simplified example, I want to transform an input like

<body>
    <p>0.1</p>
    <p>0.2</p>

    <h2>h2.1</h2>
    <h3>h3.1</h3>
    <p>3.1</p>
    <p>3.2</p>

    <h1>h1.1</h1>
    <p>1.1</p>
    <h3>h3.2</h3>
    <p>3a.1</p>
    <p>3a.2</p>
</body>

into this desired output:

<document>
   <body>
      <p>0.1</p>
      <p>0.2</p>
      <section level="2">
         <h2>h2.1</h2>
         <section level="3">
            <h3>h3.1</h3>
            <p>3.1</p>
            <p>3.2</p>
         </section>
      </section>
      <section level="1">
         <h1>h1.1</h1>
         <p>1.1</p>
         <section level="3">
            <h3>h3.2</h3>
            <p>3a.1</p>
            <p>3a.2</p>
         </section>
      </section>
   </body>
</document>

This is what I have tried so far, modifying the solution given in "XSLT: moving a grouping html elements into section levels":

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="xs mf"
    version="2.0">

    <xsl:output indent="yes"/>

    <xsl:template match="body">
        <document>
            <xsl:copy>
                <xsl:apply-templates select="@*"/>
                <xsl:sequence select="mf:group(*, 1)"/>
            </xsl:copy>
        </document>
    </xsl:template>

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@*, node()"/>
        </xsl:copy>
    </xsl:template>



    <xsl:function name="mf:group" as="node()*">
        <xsl:param name="elements" as="element()*"/>
        <xsl:param name="level" as="xs:integer"/>

        <xsl:for-each-group select="$elements" 
            group-starting-with="*[
               mf:isHead(local-name()) and 
                 (mf:getHLevel(local-name()) = $level or
                  count(preceding::*[mf:isHead(local-name())]) = 0 
                 )
               ]">
            <xsl:choose>
                <xsl:when test="self::*[mf:getHLevel(local-name()) &lt; 999]">
                    <xsl:variable name="myLevel" 
                                  select="mf:getHLevel(local-name())"/>
                    <section level="{$myLevel}">
                        <xsl:copy>
                           <xsl:apply-templates select="@*, node()"/>
                        </xsl:copy>
                        <xsl:sequence 
                            select="mf:group(current-group() except ., $myLevel + 1)"/>
                    </section>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:apply-templates select="current-group()"/>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:for-each-group>
    </xsl:function>

    <!-- Functions:
         mf:isHead(string):    tests whether string is a headline-name (h1, h2,...)
         mf:getHLevel(string): gets level of heading (h1 -> 1, h2 -> 2, ..., no heading -> 999)
         -->
    <xsl:function name="mf:getHLevel" as="xs:integer">
        <xsl:param name="s"/>
        <xsl:value-of>
          <xsl:choose>
            <xsl:when test="mf:isHead($s)">
                <xsl:value-of select="xs:integer(replace($s,'.*?(\d+).*','$1'))"/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="999"/>
            </xsl:otherwise>
          </xsl:choose>
       </xsl:value-of>
    </xsl:function>

    <xsl:function name="mf:isHead" as="xs:boolean">
        <xsl:param name="s"/> 
        <xsl:value-of select="matches($s,'h\d+')"/>
    </xsl:function>
</xsl:stylesheet>

I'm pretty sure that the conditions in @group-starting-with are wrong. In particular, count(preceding::*[mf:isHead(local-name())]) = 0 does not seem to check whether a heading element is the first one within the current sequence of elements. But I can't figure out what modifications are needed to achieve the desired output, so any help is appreciated.

1 Answer

I would simply let the function group by the current level and stop at the maximum level (which is 6 in HTML):

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:mf="http://example.org/mf"
  exclude-result-prefixes="xs mf">

<xsl:function name="mf:group" as="node()*">
  <xsl:param name="nodes" as="node()*"/>
  <xsl:param name="level" as="xs:integer"/>
  <xsl:for-each-group select="$nodes" group-starting-with="*[starts-with(local-name(), concat('h', $level))]">
    <xsl:choose>
      <xsl:when test="self::*[starts-with(local-name(), concat('h', $level))]">
        <section level="{$level}">
          <xsl:apply-templates select="."/>
          <xsl:sequence select="mf:group(current-group() except ., $level + 1)"/>
        </section>
      </xsl:when>
      <xsl:when test="$level lt 6">
        <xsl:sequence select="mf:group(current-group(), $level + 1)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="current-group()"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each-group>
</xsl:function>

<xsl:template match="@* | node()">
  <xsl:copy>
    <xsl:apply-templates select="@* , node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="body">
  <xsl:copy>
    <xsl:sequence select="mf:group(node(), 1)"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
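
The desired output in the question wraps body in a document element, which the body template above does not emit. Here is a minimal sketch of one way to add it, assuming the stylesheet above has been saved under the hypothetical file name group-sections.xsl. An importing stylesheet overrides only the body template and reuses the wrapper from the question's own attempt:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:mf="http://example.org/mf"
  exclude-result-prefixes="mf"
  version="2.0">

  <!-- Assumption: group-sections.xsl is a hypothetical file name for the
       stylesheet shown above; it supplies mf:group and the identity template.
       xsl:import has to come before any other top-level declaration. -->
  <xsl:import href="group-sections.xsl"/>

  <xsl:output indent="yes"/>

  <!-- Templates declared here have higher import precedence than the
       imported ones, so this template replaces the imported body template. -->
  <xsl:template match="body">
    <document>
      <xsl:copy>
        <xsl:apply-templates select="@*"/>
        <!-- Pass element children only, as the question's template does. -->
        <xsl:sequence select="mf:group(*, 1)"/>
      </xsl:copy>
    </document>
  </xsl:template>

</xsl:stylesheet>
```

The namespace URI bound to mf has to be the one used in the imported stylesheet (http://example.org/mf), otherwise the call to mf:group will not resolve.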

The maximum level and the heading name prefix could also be supplied as stylesheet parameters instead of being hardcoded:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:mf="http://example.org/mf"
  exclude-result-prefixes="xs mf">

<xsl:param name="max-level" as="xs:integer" select="6"/>

<xsl:param name="name-prefix" as="xs:string" select="'h'"/>

<xsl:output method="html" indent="yes"/>

<xsl:function name="mf:group" as="node()*">
  <xsl:param name="nodes" as="node()*"/>
  <xsl:param name="level" as="xs:integer"/>
  <xsl:for-each-group select="$nodes" group-starting-with="*[starts-with(local-name(), concat($name-prefix, $level))]">
    <xsl:choose>
      <xsl:when test="self::*[starts-with(local-name(), concat($name-prefix, $level))]">
        <section level="{$level}">
          <xsl:apply-templates select="."/>
          <xsl:sequence select="mf:group(current-group() except ., $level + 1)"/>
        </section>
      </xsl:when>
      <xsl:when test="$level lt $max-level">
        <xsl:sequence select="mf:group(current-group(), $level + 1)"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="current-group()"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each-group>
</xsl:function>

<xsl:template match="@* | node()">
  <xsl:copy>
    <xsl:apply-templates select="@* , node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="body">
  <xsl:copy>
    <xsl:sequence select="mf:group(*, 1)"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>
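
As a rough illustration of how the parameters might be used, the sketch below applies the same grouping to a hypothetical vocabulary whose headings are named heading1, heading2 and so on. It assumes the parameterized stylesheet above is saved as group-sections-param.xsl (also a made-up file name) and overrides name-prefix from an importing stylesheet:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs"
  version="2.0">

  <!-- Assumption: hypothetical file name for the parameterized stylesheet above. -->
  <xsl:import href="group-sections-param.xsl"/>

  <!-- A global parameter declared in the importing stylesheet has higher
       import precedence, so mf:group now tests for names starting with
       'heading1', 'heading2', ... instead of 'h1', 'h2', ... -->
  <xsl:param name="name-prefix" as="xs:string" select="'heading'"/>

</xsl:stylesheet>
```

Since both values are declared with xsl:param, they can also be set from outside when the transformation is invoked. How that is done depends on the XSLT processor.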
