HeadingSectionExtractor
Pennington.Search
Splits post-pipeline page HTML into one HeadingSection per heading (plus a lead section) so the search index can carry heading-level records that deep-link to anchors. Walks the rendered content element in document order; h2–h6 with an id start a new section, h1 is treated as the page title (not indexed into a section body), and <pre> subtrees are dropped when code blocks are excluded.
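The walk described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: `Node`, `Section`, and `SectionSketch` are hypothetical stand-ins for the real DOM element and `HeadingSection` types, whose members are not shown on this page. The sketch also assumes that an h2–h6 without an id has its text folded into the current section, and that the lead section is always emitted.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text;

// Demo: a hand-built tree standing in for the rendered content element.
var content = new Node("div", Children: new[]
{
    new Node("h1", Text: "Guide"),                    // page title, skipped
    new Node("p", Text: "Intro text."),               // goes into the lead section
    new Node("h2", Id: "install", Text: "Install"),   // anchored: starts a section
    new Node("p", Text: "Run the installer."),
    new Node("pre", Text: "dotnet add package Example"),
    new Node("h3", Text: "No anchor"),                // no id: stays in "install"
    new Node("p", Text: "Still under Install."),
});

foreach (var s in SectionSketch.Split(content, excludeCodeBlocks: true))
{
    var label = s.AnchorId ?? "(lead)";
    Console.WriteLine($"[{label}] {s.Title}: {s.Body}");
}

// Hypothetical minimal DOM node; the real API takes an IElement.
record Node(string Tag, string? Id = null, string? Text = null,
            IReadOnlyList<Node>? Children = null);

// Hypothetical result shape; AnchorId is null for the lead section.
record Section(string? AnchorId, string Title, string Body);

static class SectionSketch
{
    static readonly HashSet<string> SectionTags = new() { "h2", "h3", "h4", "h5", "h6" };

    public static List<Section> Split(Node content, bool excludeCodeBlocks)
    {
        var sections = new List<Section>();
        string? anchor = null;            // null while collecting the lead section
        var title = "";
        var body = new StringBuilder();

        // Close the section being collected and reset the text buffer.
        void Flush()
        {
            sections.Add(new Section(anchor, title, body.ToString().Trim()));
            body.Clear();
        }

        // Depth-first traversal, i.e. document order.
        void Walk(Node node)
        {
            if (node.Tag == "h1") return;                        // title, not body text
            if (node.Tag == "pre" && excludeCodeBlocks) return;  // drop the whole subtree
            if (SectionTags.Contains(node.Tag) && !string.IsNullOrEmpty(node.Id))
            {
                Flush();                                         // end previous section
                anchor = node.Id;
                title = node.Text ?? "";
                return;
            }
            if (node.Text is not null) body.Append(node.Text).Append(' ');
            foreach (var child in node.Children ?? Array.Empty<Node>()) Walk(child);
        }

        Walk(content);
        Flush();                                                 // emit the final section
        return sections;
    }
}
```

With this sample tree the sketch yields two sections: a lead section containing the intro paragraph, and an `install` section whose body omits the `<pre>` text because `excludeCodeBlocks` is `true`.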
Methods
Extract
public IReadOnlyList<HeadingSection> Extract(IElement content, bool excludeCodeBlocks)
Extracts the lead section plus one section per anchored heading from content.
Parameters
content: IElement
excludeCodeBlocks: bool
Returns
IReadOnlyList<HeadingSection>
Pennington.Search.HeadingSectionExtractor
namespace Pennington.Search;
/// Splits post-pipeline page HTML into one HeadingSection per heading (plus a lead section) so the search index can carry heading-level records that deep-link to anchors. Walks the rendered content element in document order; h2–h6 with an id start a new section, h1 is treated as the page title (not indexed into a section body), and <pre> subtrees are dropped when code blocks are excluded.
public class HeadingSectionExtractor
{
    /// Extracts the lead section plus one section per anchored heading from content.
    public IReadOnlyList<HeadingSection> Extract(IElement content, bool excludeCodeBlocks);
}