<?php
/**
 * HTML API: WP_HTML_Processor class
 *
 * @package WordPress
 * @subpackage HTML-API
 * @since 6.4.0
 */

/**
 * Core class used to safely parse and modify an HTML document.
 *
 * The HTML Processor class properly parses and modifies HTML5 documents.
 *
 * It supports a subset of the HTML5 specification, and when it encounters
 * unsupported markup, it aborts early to avoid unintentionally breaking
 * the document. The HTML Processor should never break an HTML document.
 *
 * While the `WP_HTML_Tag_Processor` is a valuable tool for modifying
 * attributes on individual HTML tags, the HTML Processor is more capable
 * and useful for the following operations:
 *
 *  - Querying based on nested HTML structure.
 *
 * Eventually the HTML Processor will also support:
 *  - Wrapping a tag in surrounding HTML.
 *  - Unwrapping a tag by removing its parent.
 *  - Inserting and removing nodes.
 *  - Reading and changing inner content.
 *  - Navigating up or around HTML structure.
 *
 * ## Usage
 *
 * Use of this class requires three steps:
 *
 *   1. Call a static creator method with your input HTML document.
 *   2. Find the location in the document you are looking for.
 *   3. Request changes to the document at that location.
 *
 * Example:
 *
 *     $processor = WP_HTML_Processor::create_fragment( $html );
 *     if ( $processor->next_tag( array( 'breadcrumbs' => array( 'DIV', 'FIGURE', 'IMG' ) ) ) ) {
 *         $processor->add_class( 'responsive-image' );
 *     }
 *
 * #### Breadcrumbs
 *
 * Breadcrumbs represent the stack of open elements from the root
 * of the document or fragment down to the currently-matched node,
 * if one is currently selected. Call WP_HTML_Processor::get_breadcrumbs()
 * to inspect the breadcrumbs for a matched tag.
 *
 * Breadcrumbs can specify nested HTML structure and are equivalent
 * to a CSS selector comprising tag names separated by the child
 * combinator, such as "DIV > FIGURE > IMG".
 *
 * Since all elements find themselves inside a full HTML document
 * when parsed, the return value from `get_breadcrumbs()` will always
 * contain any implicit outermost elements. For example, when parsing
 * with `create_fragment()` in the `BODY` context (the default), any
 * tag in the given HTML document will contain `array( 'HTML', 'BODY', … )`
 * in its breadcrumbs.
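 *
 * Example (a short sketch of inspecting the breadcrumbs of a matched tag
 * when parsing a fragment in the default `BODY` context):
 *
 *     $processor = WP_HTML_Processor::create_fragment( '<p><img></p>' );
 *     $processor->next_tag( 'IMG' );
 *     $processor->get_breadcrumbs() === array( 'HTML', 'BODY', 'P', 'IMG' );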
 *
 * Despite containing the implied outermost elements in their breadcrumbs,
 * tags may be found with the shortest-matching breadcrumb query. That is,
 * `array( 'IMG' )` matches all IMG elements and `array( 'P', 'IMG' )`
 * matches all IMG elements directly inside a P element. To rule out
 * unintended partial matches, it's possible to specify in a query the
 * full breadcrumbs all the way down from the root HTML element.
 *
 * Example:
 *
 *     $html = '<figure><img><figcaption>A <em>lovely</em> day outside</figcaption></figure>';
 *     //               ----- Matches here.
 *     $processor->next_tag( array( 'breadcrumbs' => array( 'FIGURE', 'IMG' ) ) );
 *
 *     $html = '<figure><img><figcaption>A <em>lovely</em> day outside</figcaption></figure>';
 *     //                                  ---- Matches here.
 *     $processor->next_tag( array( 'breadcrumbs' => array( 'FIGURE', 'FIGCAPTION', 'EM' ) ) );
 *
 *     $html = '<div><img></div><img>';
 *     //                       ----- Matches here, because IMG must be a direct child of the implicit BODY.
 *     $processor->next_tag( array( 'breadcrumbs' => array( 'BODY', 'IMG' ) ) );
 *
 * ## HTML Support
 *
 * This class implements a small part of the HTML5 specification.
 * It's designed to operate within its support and abort early whenever
 * encountering circumstances it can't properly handle. This is
 * the principal way in which this class remains as simple as possible
 * without cutting corners and breaking compliance.
 *
 * ### Supported elements
 *
 * If any unsupported element appears in the HTML input the HTML Processor
 * will abort early and stop all processing. This draconian measure ensures
 * that the HTML Processor won't break any HTML it doesn't fully understand.
 *
 * The HTML Processor supports all elements other than a specific set:
 *
 *  - Any element inside a TABLE.
 *  - Any element inside foreign content, including SVG and MATH.
 *  - Any element outside the IN BODY insertion mode, e.g. doctype declarations, meta, links.
 *
 * ### Supported markup
 *
 * Some kinds of non-normative HTML involve reconstruction of formatting elements and
 * re-parenting of mis-nested elements. For example, a DIV tag found inside a TABLE
 * may in fact belong _before_ the table in the DOM. If the HTML Processor encounters
 * such a case it will stop processing.
 *
 * The following list illustrates some common examples of unexpected HTML inputs that
 * the HTML Processor properly parses and represents:
 *
 *  - HTML with optional tags omitted, e.g. `<p>one<p>two`.
 *  - HTML with unexpected tag closers, e.g. `<p>one </span> more</p>`.
 *  - Non-void tags with self-closing flag, e.g. `<div/>the DIV is still open.</div>`.
 *  - Heading elements which close open heading elements of another level, e.g. `<h1>Closed by </h2>`.
 *  - Elements containing text that looks like other tags but isn't, e.g. `<title>The <img> is plaintext</title>`.
 *  - SCRIPT and STYLE tags containing text that looks like HTML but isn't, e.g. `<script>document.write('<p>Hi</p>');</script>`.
 *  - SCRIPT content which has been escaped, e.g. `<script><!-- document.write('<script>console.log("hi")</script>') --></script>`.
 *
 * ### Unsupported Features
 *
 * This parser does not report parse errors.
 *
 * Normally, when additional HTML or BODY tags are encountered in a document, if there
 * are any additional attributes on them that aren't found on the previous elements,
 * the existing HTML and BODY elements adopt those missing attribute values. This
 * parser does not add those additional attributes.
 *
 * In certain situations, elements are moved to a different part of the document in
 * processes called "adoption" and "fostering." Because the nodes move to a location
 * in the document that the parser had already processed, this parser does not support
 * these situations and will bail.
 *
 * @since 6.4.0
 *
 * @see WP_HTML_Tag_Processor
 * @see https://html.spec.whatwg.org/
 */
class WP_HTML_Processor extends WP_HTML_Tag_Processor {
	/**
	 * The maximum number of bookmarks allowed to exist at any given time.
	 *
	 * HTML processing requires more bookmarks than basic tag processing,
	 * so this class constant from the Tag Processor is overwritten.
	 *
	 * @since 6.4.0
	 *
	 * @var int
	 */
	const MAX_BOOKMARKS = 100;

	/**
	 * Holds the working state of the parser, including the stack of
	 * open elements and the stack of active formatting elements.
	 *
	 * Initialized in the constructor.
	 *
	 * @since 6.4.0
	 *
	 * @var WP_HTML_Processor_State
	 */
	private $state;

	/**
	 * Used to create unique bookmark names.
	 *
	 * This class sets a bookmark for every tag in the HTML document that it encounters.
	 * The bookmark name is auto-generated and increments, starting with `1`. These are
	 * internal bookmarks and are automatically released when the referring WP_HTML_Token
	 * goes out of scope and is garbage-collected.
	 *
	 * @since 6.4.0
	 *
	 * @see WP_HTML_Processor::$release_internal_bookmark_on_destruct
	 *
	 * @var int
	 */
	private $bookmark_counter = 0;

	/**
	 * Stores an explanation for why something failed, if it did.
	 *
	 * @see self::get_last_error
	 *
	 * @since 6.4.0
	 *
	 * @var string|null
	 */
	private $last_error = null;

	/**
	 * Stores context for why the parser bailed on unsupported HTML, if it did.
	 *
	 * @see self::get_unsupported_exception
	 *
	 * @since 6.7.0
	 *
	 * @var WP_HTML_Unsupported_Exception|null
	 */
	private $unsupported_exception = null;

	/**
	 * Releases a bookmark when PHP garbage-collects its wrapping WP_HTML_Token instance.
	 *
	 * This function is created inside the class constructor so that it can be passed to
	 * the stack of open elements and the stack of active formatting elements without
	 * exposing it as a public method on the class.
	 *
	 * @since 6.4.0
	 *
	 * @var Closure|null
	 */
	private $release_internal_bookmark_on_destruct = null;

	/**
	 * Stores stack events which arise during parsing of the
	 * HTML document, which will then supply the "match" events.
	 *
	 * @since 6.6.0
	 *
	 * @var WP_HTML_Stack_Event[]
	 */
	private $element_queue = array();

	/**
	 * Stores the current breadcrumbs.
	 *
	 * @since 6.7.0
	 *
	 * @var string[]
	 */
	private $breadcrumbs = array();

	/**
	 * Current stack event, if set, representing a matched token.
	 *
	 * Because the parser may internally point to a place further along in a document
	 * than the nodes which have already been processed (some "virtual" nodes may have
	 * appeared while scanning the HTML document), this will point at the "current" node
	 * being processed. It comes from the front of the element queue.
	 *
	 * @since 6.6.0
	 *
	 * @var WP_HTML_Stack_Event|null
	 */
	private $current_element = null;

	/**
	 * Context node if created as a fragment parser.
	 *
	 * @var WP_HTML_Token|null
	 */
	private $context_node = null;

	/*
	 * Public Interface Functions
	 */

	/**
	 * Creates an HTML processor in the fragment parsing mode.
	 *
	 * Use this for cases where you are processing chunks of HTML that
	 * will be found within a bigger HTML document, such as rendered
	 * block output that exists within a post, or `the_content` inside
	 * a rendered site layout.
	 *
	 * Fragment parsing occurs within a context, which is an HTML element
	 * that the document will eventually be placed in. It becomes important
	 * when special elements have different rules than others, such as inside
	 * a TEXTAREA or a TITLE tag where things that look like tags are text,
	 * or inside a SCRIPT tag where things that look like HTML syntax are JS.
	 *
	 * The context value should be a representation of the tag inside which
	 * the HTML will be found. For most cases this will be the body element.
	 * The HTML form is provided because a context element may have attributes
	 * that impact the parse, such as with a SCRIPT tag and its `type` attribute.
	 *
	 * ## Current HTML Support
	 *
	 *  - The only supported context is `<body>`, which is the default value.
	 *  - The only supported document encoding is `UTF-8`, which is the default value.
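	 *
	 * Example (a minimal sketch; `$block_html` is a hypothetical variable holding
	 * rendered block markup and is not defined by this class):
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( $block_html );
	 *     if ( null !== $processor && $processor->next_tag( 'IMG' ) ) {
	 *         $processor->set_attribute( 'loading', 'lazy' );
	 *         $block_html = $processor->get_updated_html();
	 *     }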
	 *
	 * @since 6.4.0
	 * @since 6.6.0 Returns `static` instead of `self` so it can create subclass instances.
	 *
	 * @param string $html     Input HTML fragment to process.
	 * @param string $context  Context element for the fragment, must be default of `<body>`.
	 * @param string $encoding Text encoding of the document; must be default of 'UTF-8'.
	 * @return static|null The created processor if successful, otherwise null.
	 */
	public static function create_fragment( $html, $context = '<body>', $encoding = 'UTF-8' ) {
		if ( '<body>' !== $context || 'UTF-8' !== $encoding ) {
			return null;
		}

		if ( ! is_string( $html ) ) {
			_doing_it_wrong(
				__METHOD__,
				__( 'The HTML parameter must be a string.' ),
				'6.9.0'
			);
			return null;
		}

		$context_processor = static::create_full_parser( "<!DOCTYPE html>{$context}", $encoding );
		if ( null === $context_processor ) {
			return null;
		}

		while ( $context_processor->next_tag() ) {
			if ( ! $context_processor->is_virtual() ) {
				$context_processor->set_bookmark( 'final_node' );
			}
		}

		if (
			! $context_processor->has_bookmark( 'final_node' ) ||
			! $context_processor->seek( 'final_node' )
		) {
			_doing_it_wrong( __METHOD__, __( 'No valid context element was detected.' ), '6.8.0' );
			return null;
		}

		return $context_processor->create_fragment_at_current_node( $html );
	}

	/**
	 * Creates an HTML processor in the full parsing mode.
	 *
	 * It's likely that a fragment parser is more appropriate, unless sending an
	 * entire HTML document from start to finish. Consider a fragment parser with
	 * a context node of `<body>`.
	 *
	 * UTF-8 is the only allowed encoding. If working with a document that
	 * isn't UTF-8, first convert the document to UTF-8, then pass in the
	 * converted HTML.
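	 *
	 * Example (a sketch of the conversion step; `$latin1_html` is a hypothetical
	 * string assumed to be encoded as ISO-8859-1):
	 *
	 *     $utf8_html = mb_convert_encoding( $latin1_html, 'UTF-8', 'ISO-8859-1' );
	 *     $processor = WP_HTML_Processor::create_full_parser( $utf8_html );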
	 *
	 * @param string      $html                    Input HTML document to process.
	 * @param string|null $known_definite_encoding Optional. If provided, specifies the charset used
	 *                                             in the input byte stream. Currently must be UTF-8.
	 * @return static|null The created processor if successful, otherwise null.
	 */
	public static function create_full_parser( $html, $known_definite_encoding = 'UTF-8' ) {
		if ( 'UTF-8' !== $known_definite_encoding ) {
			return null;
		}
		if ( ! is_string( $html ) ) {
			_doing_it_wrong(
				__METHOD__,
				__( 'The HTML parameter must be a string.' ),
				'6.9.0'
			);
			return null;
		}

		$processor                             = new static( $html, self::CONSTRUCTOR_UNLOCK_CODE );
		$processor->state->encoding            = $known_definite_encoding;
		$processor->state->encoding_confidence = 'certain';

		return $processor;
	}

	/**
	 * Constructor.
	 *
	 * Do not use this method. Use the static creator methods instead.
	 *
	 * @access private
	 *
	 * @since 6.4.0
	 *
	 * @see WP_HTML_Processor::create_fragment()
	 *
	 * @param string      $html                                  HTML to process.
	 * @param string|null $use_the_static_create_methods_instead This constructor should not be called manually.
	 */
	public function __construct( $html, $use_the_static_create_methods_instead = null ) {
		parent::__construct( $html );

		if ( self::CONSTRUCTOR_UNLOCK_CODE !== $use_the_static_create_methods_instead ) {
			_doing_it_wrong(
				__METHOD__,
				sprintf(
					/* translators: %s: WP_HTML_Processor::create_fragment(). */
					__( 'Call %s to create an HTML Processor instead of calling the constructor directly.' ),
					'<code>WP_HTML_Processor::create_fragment()</code>'
				),
				'6.4.0'
			);
		}

		$this->state = new WP_HTML_Processor_State();

		$this->state->stack_of_open_elements->set_push_handler(
			function ( WP_HTML_Token $token ): void {
				$is_virtual            = ! isset( $this->state->current_token ) || $this->is_tag_closer();
				$same_node             = isset( $this->state->current_token ) && $token->node_name === $this->state->current_token->node_name;
				$provenance            = ( ! $same_node || $is_virtual ) ? 'virtual' : 'real';
				$this->element_queue[] = new WP_HTML_Stack_Event( $token, WP_HTML_Stack_Event::PUSH, $provenance );

				$this->change_parsing_namespace( $token->integration_node_type ? 'html' : $token->namespace );
			}
		);

		$this->state->stack_of_open_elements->set_pop_handler(
			function ( WP_HTML_Token $token ): void {
				$is_virtual            = ! isset( $this->state->current_token ) || ! $this->is_tag_closer();
				$same_node             = isset( $this->state->current_token ) && $token->node_name === $this->state->current_token->node_name;
				$provenance            = ( ! $same_node || $is_virtual ) ? 'virtual' : 'real';
				$this->element_queue[] = new WP_HTML_Stack_Event( $token, WP_HTML_Stack_Event::POP, $provenance );

				$adjusted_current_node = $this->get_adjusted_current_node();

				if ( $adjusted_current_node ) {
					$this->change_parsing_namespace( $adjusted_current_node->integration_node_type ? 'html' : $adjusted_current_node->namespace );
				} else {
					$this->change_parsing_namespace( 'html' );
				}
			}
		);

		/*
		 * Create this wrapper so that it's possible to pass
		 * a private method into WP_HTML_Token classes without
		 * exposing it to any public API.
		 */
		$this->release_internal_bookmark_on_destruct = function ( string $name ): void {
			parent::release_bookmark( $name );
		};
	}

	/**
	 * Creates a fragment processor at the current node.
	 *
	 * HTML Fragment parsing always happens with a context node. HTML Fragment Processors can be
	 * instantiated with a `BODY` context node via `WP_HTML_Processor::create_fragment( $html )`.
	 *
	 * The context node may impact how a fragment of HTML is parsed. For example, consider the HTML
	 * fragment `<td />Inside TD?</td>`.
	 *
	 * A BODY context node will produce the following tree:
	 *
	 *     └─#text Inside TD?
	 *
	 * Notice that the `<td>` tags are completely ignored.
	 *
	 * Compare that with an SVG context node that produces the following tree:
	 *
	 *     ├─svg:td
	 *     └─#text Inside TD?
	 *
	 * Here, a `td` node in the `svg` namespace is created, and its self-closing flag is respected.
	 * This is a peculiarity of parsing HTML in foreign content like SVG.
	 *
	 * Finally, consider the tree produced with a TABLE context node:
	 *
	 *     └─TBODY
	 *       └─TR
	 *         └─TD
	 *           └─#text Inside TD?
	 *
	 * These examples demonstrate how important the context node may be when processing an HTML
	 * fragment. Special care must be taken when processing fragments that are expected to appear
	 * in specific contexts. SVG and TABLE are good examples, but there are others.
	 *
	 * @see https://html.spec.whatwg.org/multipage/parsing.html#html-fragment-parsing-algorithm
	 *
	 * @since 6.8.0
	 *
	 * @param string $html Input HTML fragment to process.
	 * @return static|null The created processor if successful, otherwise null.
	 */
	private function create_fragment_at_current_node( string $html ) {
		if ( $this->get_token_type() !== '#tag' || $this->is_tag_closer() ) {
			_doing_it_wrong(
				__METHOD__,
				__( 'The context element must be a start tag.' ),
				'6.8.0'
			);
			return null;
		}

		$tag_name  = $this->current_element->token->node_name;
		$namespace = $this->current_element->token->namespace;

		if ( 'html' === $namespace && self::is_void( $tag_name ) ) {
			_doing_it_wrong(
				__METHOD__,
				sprintf(
					// translators: %s: A tag name like INPUT or BR.
					__( 'The context element cannot be a void element, found "%s".' ),
					$tag_name
				),
				'6.8.0'
			);
			return null;
		}

		/*
		 * Prevent creating fragments at nodes that require a special tokenizer state.
		 * This is unsupported by the HTML Processor.
		 */
		if (
			'html' === $namespace &&
			in_array( $tag_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP', 'PLAINTEXT' ), true )
		) {
			_doing_it_wrong(
				__METHOD__,
				sprintf(
					// translators: %s: A tag name like IFRAME or TEXTAREA.
					__( 'The context element "%s" is not supported.' ),
					$tag_name
				),
				'6.8.0'
			);
			return null;
		}

		$fragment_processor = new static( $html, self::CONSTRUCTOR_UNLOCK_CODE );

		$fragment_processor->compat_mode = $this->compat_mode;

		// @todo Create "fake" bookmarks for non-existent but implied nodes.
		$fragment_processor->bookmarks['root-node'] = new WP_HTML_Span( 0, 0 );
		$root_node                                  = new WP_HTML_Token(
			'root-node',
			'HTML',
			false
		);
		$fragment_processor->state->stack_of_open_elements->push( $root_node );

		$fragment_processor->bookmarks['context-node']   = new WP_HTML_Span( 0, 0 );
		$fragment_processor->context_node                = clone $this->current_element->token;
		$fragment_processor->context_node->bookmark_name = 'context-node';
		$fragment_processor->context_node->on_destroy    = null;

		$fragment_processor->breadcrumbs = array( 'HTML', $fragment_processor->context_node->node_name );

		if ( 'TEMPLATE' === $fragment_processor->context_node->node_name ) {
			$fragment_processor->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
		}

		$fragment_processor->reset_insertion_mode_appropriately();

		/*
		 * > Set the parser's form element pointer to the nearest node to the context element that
		 * > is a form element (going straight up the ancestor chain, and including the element
		 * > itself, if it is a form element), if any. (If there is no such form element, the
		 * > form element pointer keeps its initial value, null.)
		 */
		foreach ( $this->state->stack_of_open_elements->walk_up() as $element ) {
			if ( 'FORM' === $element->node_name && 'html' === $element->namespace ) {
				$fragment_processor->state->form_element                = clone $element;
				$fragment_processor->state->form_element->bookmark_name = null;
				$fragment_processor->state->form_element->on_destroy    = null;
				break;
			}
		}

		$fragment_processor->state->encoding_confidence = 'irrelevant';

		/*
		 * Update the parsing namespace near the end of the process.
		 * This is important so that any push/pop from the stack of open
		 * elements does not change the parsing namespace.
		 */
		$fragment_processor->change_parsing_namespace(
			$this->current_element->token->integration_node_type ? 'html' : $namespace
		);

		return $fragment_processor;
	}

	/**
	 * Stops the parser and terminates its execution when encountering unsupported markup.
	 *
	 * @throws WP_HTML_Unsupported_Exception Halts execution of the parser.
	 *
	 * @since 6.7.0
	 *
	 * @param string $message Explains what support is missing in order to parse the current node.
	 */
	private function bail( string $message ) {
		$here  = $this->bookmarks[ $this->state->current_token->bookmark_name ];
		$token = substr( $this->html, $here->start, $here->length );

		$open_elements = array();
		foreach ( $this->state->stack_of_open_elements->stack as $item ) {
			$open_elements[] = $item->node_name;
		}

		$active_formats = array();
		foreach ( $this->state->active_formatting_elements->walk_down() as $item ) {
			$active_formats[] = $item->node_name;
		}

		$this->last_error = self::ERROR_UNSUPPORTED;

		$this->unsupported_exception = new WP_HTML_Unsupported_Exception(
			$message,
			$this->state->current_token->node_name,
			$here->start,
			$token,
			$open_elements,
			$active_formats
		);

		throw $this->unsupported_exception;
	}

	/**
	 * Returns the last error, if any.
	 *
	 * Various situations lead to parsing failure but this class will
	 * return `false` in all those cases. To determine why something
	 * failed it's possible to request the last error. This helps
	 * distinguish whether a given tag couldn't be found or whether
	 * content in the document caused the processor to give up and
	 * abort processing.
	 *
	 * Example:
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<template><strong><button><em><p><em>' );
	 *     false === $processor->next_tag();
	 *     WP_HTML_Processor::ERROR_UNSUPPORTED === $processor->get_last_error();
	 *
	 * @since 6.4.0
	 *
	 * @see self::ERROR_UNSUPPORTED
	 * @see self::ERROR_EXCEEDED_MAX_BOOKMARKS
	 *
	 * @return string|null The last error, if one exists, otherwise null.
	 */
	public function get_last_error(): ?string {
		return $this->last_error;
	}

	/**
	 * Returns context for why the parser aborted due to unsupported HTML, if it did.
	 *
	 * This is meant for debugging purposes, not for production use.
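	 *
	 * Example (a debugging sketch; `$html` stands in for any input markup, and
	 * the processor is assumed to have been created successfully):
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( $html );
	 *     while ( $processor->next_token() ) {
	 *         continue;
	 *     }
	 *
	 *     $exception = $processor->get_unsupported_exception();
	 *     if ( null !== $exception ) {
	 *         error_log( $exception->getMessage() );
	 *     }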
	 *
	 * @since 6.7.0
	 *
	 * @see self::$unsupported_exception
	 *
	 * @return WP_HTML_Unsupported_Exception|null
	 */
	public function get_unsupported_exception() {
		return $this->unsupported_exception;
	}

	/**
	 * Finds the next tag matching the $query.
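	 *
	 * Example (a sketch of two query forms; the FIGURE and IMG tag names are
	 * only illustrative):
	 *
	 *     // Find the second IMG that is a direct child of a FIGURE.
	 *     $processor->next_tag(
	 *         array(
	 *             'breadcrumbs'  => array( 'FIGURE', 'IMG' ),
	 *             'match_offset' => 2,
	 *         )
	 *     );
	 *
	 *     // A string query is shorthand for a single-item breadcrumbs query.
	 *     $processor->next_tag( 'IMG' );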
	 *
	 * @todo Support matching the class name and tag name.
	 *
	 * @since 6.4.0
	 * @since 6.6.0 Visits all tokens, including virtual ones.
	 *
	 * @throws Exception When unable to allocate a bookmark for the next token in the input HTML document.
	 *
	 * @param array|string|null $query {
	 *     Optional. Which tag name to find, having which class, etc. Default is to find any tag.
	 *
	 *     @type string|null $tag_name     Which tag to find, or `null` for "any tag."
	 *     @type string      $tag_closers  'visit' to pause at tag closers, 'skip' or unset to only visit openers.
	 *     @type int|null    $match_offset Find the Nth tag matching all search criteria.
	 *                                     1 for "first" tag, 3 for "third," etc.
	 *                                     Defaults to first tag.
	 *     @type string|null $class_name   Tag must contain this whole class name to match.
	 *     @type string[]    $breadcrumbs  DOM sub-path at which element is found, e.g. `array( 'FIGURE', 'IMG' )`.
	 *                                     May also contain the wildcard `*` which matches a single element, e.g. `array( 'SECTION', '*' )`.
	 * }
	 * @return bool Whether a tag was matched.
	 */
	public function next_tag( $query = null ): bool {
		$visit_closers = isset( $query['tag_closers'] ) && 'visit' === $query['tag_closers'];

		if ( null === $query ) {
			while ( $this->next_token() ) {
				if ( '#tag' !== $this->get_token_type() ) {
					continue;
				}

				if ( ! $this->is_tag_closer() || $visit_closers ) {
					return true;
				}
			}

			return false;
		}

		if ( is_string( $query ) ) {
			$query = array( 'breadcrumbs' => array( $query ) );
		}

		if ( ! is_array( $query ) ) {
			_doing_it_wrong(
				__METHOD__,
				__( 'Please pass a query array to this function.' ),
				'6.4.0'
			);
			return false;
		}

		if ( isset( $query['tag_name'] ) ) {
			$query['tag_name'] = strtoupper( $query['tag_name'] );
		}

		$needs_class = ( isset( $query['class_name'] ) && is_string( $query['class_name'] ) )
			? $query['class_name']
			: null;

		if ( ! ( array_key_exists( 'breadcrumbs', $query ) && is_array( $query['breadcrumbs'] ) ) ) {
			while ( $this->next_token() ) {
				if ( '#tag' !== $this->get_token_type() ) {
					continue;
				}

				if ( isset( $query['tag_name'] ) && $query['tag_name'] !== $this->get_token_name() ) {
					continue;
				}

				if ( isset( $needs_class ) && ! $this->has_class( $needs_class ) ) {
					continue;
				}

				if ( ! $this->is_tag_closer() || $visit_closers ) {
					return true;
				}
			}

			return false;
		}

		$breadcrumbs  = $query['breadcrumbs'];
		$match_offset = isset( $query['match_offset'] ) ? (int) $query['match_offset'] : 1;

		while ( $match_offset > 0 && $this->next_token() ) {
			if ( '#tag' !== $this->get_token_type() || $this->is_tag_closer() ) {
				continue;
			}

			if ( isset( $needs_class ) && ! $this->has_class( $needs_class ) ) {
				continue;
			}

			if ( $this->matches_breadcrumbs( $breadcrumbs ) && 0 === --$match_offset ) {
				return true;
			}
		}

		return false;
	}

	/**
762	 * Finds the next token in the HTML document.
763	 *
764	 * This doesn't currently have a way to represent non-tags and doesn't process
765	 * semantic rules for text nodes. For access to the raw tokens consider using
766	 * WP_HTML_Tag_Processor instead.
767	 *
768	 * @since 6.5.0 Added for internal support; do not use.
769	 * @since 6.7.2 Refactored so subclasses may extend.
770	 *
771	 * @return bool Whether a token was parsed.
772	 */
773	public function next_token(): bool {
774		return $this->next_visitable_token();
775	}
776
777	/**
778	 * Ensures internal accounting is maintained for HTML semantic rules while
779	 * the underlying Tag Processor class is seeking to a bookmark.
780	 *
781	 * This doesn't currently have a way to represent non-tags and doesn't process
782	 * semantic rules for text nodes. For access to the raw tokens consider using
783	 * WP_HTML_Tag_Processor instead.
784	 *
785	 * Note that this method may call itself recursively. This is why it is not
786	 * implemented as {@see WP_HTML_Processor::next_token()}, which instead calls
787	 * this method similarly to how {@see WP_HTML_Tag_Processor::next_token()}
788	 * calls the {@see WP_HTML_Tag_Processor::base_class_next_token()} method.
789	 *
790	 * @since 6.7.2 Added for internal support.
791	 *
792	 * @access private
793	 *
794	 * @return bool
795	 */
796	private function next_visitable_token(): bool {
797		$this->current_element = null;
798
799		if ( isset( $this->last_error ) ) {
800			return false;
801		}
802
803		/*
804		 * Prime the events if there are none.
805		 *
806		 * @todo In some cases, probably related to the adoption agency
807		 *       algorithm, this call to step() doesn't create any new
808		 *       events. Calling it again creates them. Figure out why
809		 *       this is and if it's inherent or if it's a bug. Looping
810		 *       until there are events or until there are no more
811		 *       tokens works in the meantime and isn't obviously wrong.
812		 */
813		if ( empty( $this->element_queue ) && $this->step() ) {
814			return $this->next_visitable_token();
815		}
816
817		// Process the next event on the queue.
818		$this->current_element = array_shift( $this->element_queue );
819		if ( ! isset( $this->current_element ) ) {
820			// There are no tokens left, so close all remaining open elements.
821			while ( $this->state->stack_of_open_elements->pop() ) {
822				continue;
823			}
824
825			return empty( $this->element_queue ) ? false : $this->next_visitable_token();
826		}
827
828		$is_pop = WP_HTML_Stack_Event::POP === $this->current_element->operation;
829
830		/*
831		 * The root node only exists in the fragment parser, and closing it
832		 * indicates that the parse is complete. Stop before popping it from
833		 * the breadcrumbs.
834		 */
835		if ( 'root-node' === $this->current_element->token->bookmark_name ) {
836			return $this->next_visitable_token();
837		}
838
839		// Adjust the breadcrumbs for this event.
840		if ( $is_pop ) {
841			array_pop( $this->breadcrumbs );
842		} else {
843			$this->breadcrumbs[] = $this->current_element->token->node_name;
844		}
845
		// Avoid sending close events for elements which don't expect a closer.
847		if ( $is_pop && ! $this->expects_closer( $this->current_element->token ) ) {
848			return $this->next_visitable_token();
849		}
850
851		return true;
852	}
853
854	/**
855	 * Indicates if the current tag token is a tag closer.
856	 *
857	 * Example:
858	 *
859	 *     $p = WP_HTML_Processor::create_fragment( '<div></div>' );
860	 *     $p->next_tag( array( 'tag_name' => 'div', 'tag_closers' => 'visit' ) );
861	 *     $p->is_tag_closer() === false;
862	 *
863	 *     $p->next_tag( array( 'tag_name' => 'div', 'tag_closers' => 'visit' ) );
864	 *     $p->is_tag_closer() === true;
865	 *
866	 * @since 6.6.0 Subclassed for HTML Processor.
867	 *
868	 * @return bool Whether the current tag is a tag closer.
869	 */
870	public function is_tag_closer(): bool {
871		return $this->is_virtual()
872			? ( WP_HTML_Stack_Event::POP === $this->current_element->operation && '#tag' === $this->get_token_type() )
873			: parent::is_tag_closer();
874	}
875
876	/**
877	 * Indicates if the currently-matched token is virtual, created by a stack operation
878	 * while processing HTML, rather than a token found in the HTML text itself.
879	 *
880	 * @since 6.6.0
881	 *
882	 * @return bool Whether the current token is virtual.
883	 */
884	private function is_virtual(): bool {
885		return (
886			isset( $this->current_element->provenance ) &&
887			'virtual' === $this->current_element->provenance
888		);
889	}
890
891	/**
892	 * Indicates if the currently-matched tag matches the given breadcrumbs.
893	 *
	 * A `*` is a single-tag wildcard: it matches any one tag, but never zero tags.
895	 *
896	 * At some point this function _may_ support a `**` syntax for matching any number
897	 * of unspecified tags in the breadcrumb stack. This has been intentionally left
898	 * out, however, to keep this function simple and to avoid introducing backtracking,
899	 * which could open up surprising performance breakdowns.
900	 *
901	 * Example:
902	 *
903	 *     $processor = WP_HTML_Processor::create_fragment( '<div><span><figure><img></figure></span></div>' );
904	 *     $processor->next_tag( 'img' );
905	 *     true  === $processor->matches_breadcrumbs( array( 'figure', 'img' ) );
906	 *     true  === $processor->matches_breadcrumbs( array( 'span', 'figure', 'img' ) );
907	 *     false === $processor->matches_breadcrumbs( array( 'span', 'img' ) );
908	 *     true  === $processor->matches_breadcrumbs( array( 'span', '*', 'img' ) );
909	 *
910	 * @since 6.4.0
911	 *
912	 * @param string[] $breadcrumbs DOM sub-path at which element is found, e.g. `array( 'FIGURE', 'IMG' )`.
913	 *                              May also contain the wildcard `*` which matches a single element, e.g. `array( 'SECTION', '*' )`.
914	 * @return bool Whether the currently-matched tag is found at the given nested structure.
915	 */
916	public function matches_breadcrumbs( $breadcrumbs ): bool {
917		// Everything matches when there are zero constraints.
918		if ( 0 === count( $breadcrumbs ) ) {
919			return true;
920		}
921
922		// Start at the last crumb.
923		$crumb = end( $breadcrumbs );
924
925		if ( '*' !== $crumb && $this->get_tag() !== strtoupper( $crumb ) ) {
926			return false;
927		}
928
929		for ( $i = count( $this->breadcrumbs ) - 1; $i >= 0; $i-- ) {
930			$node  = $this->breadcrumbs[ $i ];
931			$crumb = strtoupper( current( $breadcrumbs ) );
932
933			if ( '*' !== $crumb && $node !== $crumb ) {
934				return false;
935			}
936
937			if ( false === prev( $breadcrumbs ) ) {
938				return true;
939			}
940		}
941
942		return false;
943	}
944
945	/**
946	 * Indicates if the currently-matched node expects a closing
947	 * token, or if it will self-close on the next step.
948	 *
949	 * Most HTML elements expect a closer, such as a P element or
	 * a DIV element. Others, like an IMG element, are void and don't
951	 * have a closing tag. Special elements, such as SCRIPT and STYLE,
952	 * are treated just like void tags. Text nodes and self-closing
953	 * foreign content will also act just like a void tag, immediately
954	 * closing as soon as the processor advances to the next token.
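	 *
	 * A brief illustrative sketch (not from the upstream documentation; the
	 * input markup is only an assumption chosen for demonstration):
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<p><img>text</p>' );
	 *     $processor->next_token(); // Opening P tag: a closer is expected.
	 *     true === $processor->expects_closer();
	 *
	 *     $processor->next_token(); // IMG is void: no closer is expected.
	 *     false === $processor->expects_closer();
	 *
	 *     $processor->next_token(); // Text nodes close as soon as the processor advances.
	 *     false === $processor->expects_closer();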
955	 *
956	 * @since 6.6.0
957	 *
958	 * @param WP_HTML_Token|null $node Optional. Node to examine, if provided.
959	 *                                 Default is to examine current node.
960	 * @return bool|null Whether to expect a closer for the currently-matched node,
961	 *                   or `null` if not matched on any token.
962	 */
963	public function expects_closer( ?WP_HTML_Token $node = null ): ?bool {
964		$token_name = $node->node_name ?? $this->get_token_name();
965
966		if ( ! isset( $token_name ) ) {
967			return null;
968		}
969
970		$token_namespace        = $node->namespace ?? $this->get_namespace();
971		$token_has_self_closing = $node->has_self_closing_flag ?? $this->has_self_closing_flag();
972
973		return ! (
974			// Comments, text nodes, and other atomic tokens.
975			'#' === $token_name[0] ||
976			// Doctype declarations.
977			'html' === $token_name ||
978			// Void elements.
979			( 'html' === $token_namespace && self::is_void( $token_name ) ) ||
980			// Special atomic elements.
981			( 'html' === $token_namespace && in_array( $token_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP' ), true ) ) ||
982			// Self-closing elements in foreign content.
983			( 'html' !== $token_namespace && $token_has_self_closing )
984		);
985	}
986
987	/**
	 * Steps through the HTML document and stops at the next tag, if any.
989	 *
990	 * @since 6.4.0
991	 *
992	 * @throws Exception When unable to allocate a bookmark for the next token in the input HTML document.
993	 *
994	 * @see self::PROCESS_NEXT_NODE
995	 * @see self::REPROCESS_CURRENT_NODE
996	 *
997	 * @param string $node_to_process Whether to parse the next node or reprocess the current node.
998	 * @return bool Whether a tag was matched.
999	 */
1000	public function step( $node_to_process = self::PROCESS_NEXT_NODE ): bool {
1001		// Refuse to proceed if there was a previous error.
1002		if ( null !== $this->last_error ) {
1003			return false;
1004		}
1005
1006		if ( self::REPROCESS_CURRENT_NODE !== $node_to_process ) {
1007			/*
1008			 * Void elements still hop onto the stack of open elements even though
1009			 * there's no corresponding closing tag. This is important for managing
1010			 * stack-based operations such as "navigate to parent node" or checking
1011			 * on an element's breadcrumbs.
1012			 *
1013			 * When moving on to the next node, therefore, if the bottom-most element
1014			 * on the stack is a void element, it must be closed.
1015			 */
1016			$top_node = $this->state->stack_of_open_elements->current_node();
1017			if ( isset( $top_node ) && ! $this->expects_closer( $top_node ) ) {
1018				$this->state->stack_of_open_elements->pop();
1019			}
1020		}
1021
1022		if ( self::PROCESS_NEXT_NODE === $node_to_process ) {
1023			parent::next_token();
1024			if ( WP_HTML_Tag_Processor::STATE_TEXT_NODE === $this->parser_state ) {
1025				parent::subdivide_text_appropriately();
1026			}
1027		}
1028
1029		// Finish stepping when there are no more tokens in the document.
1030		if (
1031			WP_HTML_Tag_Processor::STATE_INCOMPLETE_INPUT === $this->parser_state ||
1032			WP_HTML_Tag_Processor::STATE_COMPLETE === $this->parser_state
1033		) {
1034			return false;
1035		}
1036
1037		$adjusted_current_node = $this->get_adjusted_current_node();
1038		$is_closer             = $this->is_tag_closer();
1039		$is_start_tag          = WP_HTML_Tag_Processor::STATE_MATCHED_TAG === $this->parser_state && ! $is_closer;
1040		$token_name            = $this->get_token_name();
1041
1042		if ( self::REPROCESS_CURRENT_NODE !== $node_to_process ) {
1043			$this->state->current_token = new WP_HTML_Token(
1044				$this->bookmark_token(),
1045				$token_name,
1046				$this->has_self_closing_flag(),
1047				$this->release_internal_bookmark_on_destruct
1048			);
1049		}
1050
1051		$parse_in_current_insertion_mode = (
1052			0 === $this->state->stack_of_open_elements->count() ||
1053			'html' === $adjusted_current_node->namespace ||
1054			(
1055				'math' === $adjusted_current_node->integration_node_type &&
1056				(
1057					( $is_start_tag && ! in_array( $token_name, array( 'MGLYPH', 'MALIGNMARK' ), true ) ) ||
1058					'#text' === $token_name
1059				)
1060			) ||
1061			(
1062				'math' === $adjusted_current_node->namespace &&
1063				'ANNOTATION-XML' === $adjusted_current_node->node_name &&
1064				$is_start_tag && 'SVG' === $token_name
1065			) ||
1066			(
1067				'html' === $adjusted_current_node->integration_node_type &&
1068				( $is_start_tag || '#text' === $token_name )
1069			)
1070		);
1071
1072		try {
1073			if ( ! $parse_in_current_insertion_mode ) {
1074				return $this->step_in_foreign_content();
1075			}
1076
1077			switch ( $this->state->insertion_mode ) {
1078				case WP_HTML_Processor_State::INSERTION_MODE_INITIAL:
1079					return $this->step_initial();
1080
1081				case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML:
1082					return $this->step_before_html();
1083
1084				case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD:
1085					return $this->step_before_head();
1086
1087				case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD:
1088					return $this->step_in_head();
1089
1090				case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD_NOSCRIPT:
1091					return $this->step_in_head_noscript();
1092
1093				case WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD:
1094					return $this->step_after_head();
1095
1096				case WP_HTML_Processor_State::INSERTION_MODE_IN_BODY:
1097					return $this->step_in_body();
1098
1099				case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE:
1100					return $this->step_in_table();
1101
1102				case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_TEXT:
1103					return $this->step_in_table_text();
1104
1105				case WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION:
1106					return $this->step_in_caption();
1107
1108				case WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP:
1109					return $this->step_in_column_group();
1110
1111				case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY:
1112					return $this->step_in_table_body();
1113
1114				case WP_HTML_Processor_State::INSERTION_MODE_IN_ROW:
1115					return $this->step_in_row();
1116
1117				case WP_HTML_Processor_State::INSERTION_MODE_IN_CELL:
1118					return $this->step_in_cell();
1119
1120				case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT:
1121					return $this->step_in_select();
1122
1123				case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE:
1124					return $this->step_in_select_in_table();
1125
1126				case WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE:
1127					return $this->step_in_template();
1128
1129				case WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY:
1130					return $this->step_after_body();
1131
1132				case WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET:
1133					return $this->step_in_frameset();
1134
1135				case WP_HTML_Processor_State::INSERTION_MODE_AFTER_FRAMESET:
1136					return $this->step_after_frameset();
1137
1138				case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_BODY:
1139					return $this->step_after_after_body();
1140
1141				case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_FRAMESET:
1142					return $this->step_after_after_frameset();
1143
1144				// This should be unreachable but PHP doesn't have total type checking on switch.
1145				default:
1146					$this->bail( "Unaware of the requested parsing mode: '{$this->state->insertion_mode}'." );
1147			}
1148		} catch ( WP_HTML_Unsupported_Exception $e ) {
1149			/*
1150			 * Exceptions are used in this class to escape deep call stacks that
1151			 * otherwise might involve messier calling and return conventions.
1152			 */
1153			return false;
1154		}
1155	}
1156
1157	/**
1158	 * Computes the HTML breadcrumbs for the currently-matched node, if matched.
1159	 *
1160	 * Breadcrumbs start at the outermost parent and descend toward the matched element.
1161	 * They always include the entire path from the root HTML node to the matched element.
1162	 *
1163	 * Example:
1164	 *
1165	 *     $processor = WP_HTML_Processor::create_fragment( '<p><strong><em><img></em></strong></p>' );
1166	 *     $processor->next_tag( 'IMG' );
1167	 *     $processor->get_breadcrumbs() === array( 'HTML', 'BODY', 'P', 'STRONG', 'EM', 'IMG' );
1168	 *
1169	 * @since 6.4.0
1170	 *
1171	 * @return string[] Array of tag names representing path to matched node.
1172	 */
1173	public function get_breadcrumbs(): array {
1174		return $this->breadcrumbs;
1175	}
1176
1177	/**
1178	 * Returns the nesting depth of the current location in the document.
1179	 *
1180	 * Example:
1181	 *
1182	 *     $processor = WP_HTML_Processor::create_fragment( '<div><p></p></div>' );
1183	 *     // The processor starts in the BODY context, meaning it has depth from the start: HTML > BODY.
1184	 *     2 === $processor->get_current_depth();
1185	 *
1186	 *     // Opening the DIV element increases the depth.
1187	 *     $processor->next_token();
1188	 *     3 === $processor->get_current_depth();
1189	 *
1190	 *     // Opening the P element increases the depth.
1191	 *     $processor->next_token();
1192	 *     4 === $processor->get_current_depth();
1193	 *
1194	 *     // The P element is closed during `next_token()` so the depth is decreased to reflect that.
1195	 *     $processor->next_token();
1196	 *     3 === $processor->get_current_depth();
1197	 *
1198	 * @since 6.6.0
1199	 *
1200	 * @return int Nesting-depth of current location in the document.
1201	 */
1202	public function get_current_depth(): int {
1203		return count( $this->breadcrumbs );
1204	}
1205
1206	/**
1207	 * Normalizes an HTML fragment by serializing it.
1208	 *
1209	 * This method assumes that the given HTML snippet is found in BODY context.
1210	 * For normalizing full documents or fragments found in other contexts, create
1211	 * a new processor using {@see WP_HTML_Processor::create_fragment} or
1212	 * {@see WP_HTML_Processor::create_full_parser} and call {@see WP_HTML_Processor::serialize}
1213	 * on the created instances.
1214	 *
1215	 * Many aspects of an input HTML fragment may be changed during normalization.
1216	 *
1217	 *  - Attribute values will be double-quoted.
1218	 *  - Duplicate attributes will be removed.
1219	 *  - Omitted tags will be added.
	 *  - Tag and attribute names will be lower-cased,
1221	 *    except for specific SVG and MathML tags or attributes.
1222	 *  - Text will be re-encoded, null bytes handled,
1223	 *    and invalid UTF-8 replaced with U+FFFD.
1224	 *  - Any incomplete syntax trailing at the end will be omitted,
1225	 *    for example, an unclosed comment opener will be removed.
1226	 *
1227	 * Example:
1228	 *
1229	 *     echo WP_HTML_Processor::normalize( '<a href=#anchor v=5 href="/" enabled>One</a another v=5><!--' );
1230	 *     // <a href="#anchor" v="5" enabled>One</a>
1231	 *
1232	 *     echo WP_HTML_Processor::normalize( '<div></p>fun<table><td>cell</div>' );
1233	 *     // <div><p></p>fun<table><tbody><tr><td>cell</td></tr></tbody></table></div>
1234	 *
1235	 *     echo WP_HTML_Processor::normalize( '<![CDATA[invalid comment]]> syntax < <> "oddities"' );
1236	 *     // <!--[CDATA[invalid comment]]--> syntax &lt; &lt;&gt; &quot;oddities&quot;
1237	 *
1238	 * @since 6.7.0
1239	 *
1240	 * @param string $html Input HTML to normalize.
1241	 *
1242	 * @return string|null Normalized output, or `null` if unable to normalize.
1243	 */
1244	public static function normalize( string $html ): ?string {
1245		return static::create_fragment( $html )->serialize();
1246	}
1247
1248	/**
1249	 * Returns normalized HTML for a fragment by serializing it.
1250	 *
1251	 * This differs from {@see WP_HTML_Processor::normalize} in that it starts with
1252	 * a specific HTML Processor, which _must_ not have already started scanning;
1253	 * it must be in the initial ready state and will be in the completed state once
1254	 * serialization is complete.
1255	 *
1256	 * Many aspects of an input HTML fragment may be changed during normalization.
1257	 *
1258	 *  - Attribute values will be double-quoted.
1259	 *  - Duplicate attributes will be removed.
1260	 *  - Omitted tags will be added.
	 *  - Tag and attribute names will be lower-cased,
1262	 *    except for specific SVG and MathML tags or attributes.
1263	 *  - Text will be re-encoded, null bytes handled,
1264	 *    and invalid UTF-8 replaced with U+FFFD.
1265	 *  - Any incomplete syntax trailing at the end will be omitted,
1266	 *    for example, an unclosed comment opener will be removed.
1267	 *
1268	 * Example:
1269	 *
1270	 *     $processor = WP_HTML_Processor::create_fragment( '<a href=#anchor v=5 href="/" enabled>One</a another v=5><!--' );
1271	 *     echo $processor->serialize();
1272	 *     // <a href="#anchor" v="5" enabled>One</a>
1273	 *
1274	 *     $processor = WP_HTML_Processor::create_fragment( '<div></p>fun<table><td>cell</div>' );
1275	 *     echo $processor->serialize();
1276	 *     // <div><p></p>fun<table><tbody><tr><td>cell</td></tr></tbody></table></div>
1277	 *
1278	 *     $processor = WP_HTML_Processor::create_fragment( '<![CDATA[invalid comment]]> syntax < <> "oddities"' );
1279	 *     echo $processor->serialize();
1280	 *     // <!--[CDATA[invalid comment]]--> syntax &lt; &lt;&gt; &quot;oddities&quot;
1281	 *
1282	 * @since 6.7.0
1283	 *
1284	 * @return string|null Normalized HTML markup represented by processor,
1285	 *                     or `null` if unable to generate serialization.
1286	 */
1287	public function serialize(): ?string {
1288		if ( WP_HTML_Tag_Processor::STATE_READY !== $this->parser_state ) {
1289			wp_trigger_error(
1290				__METHOD__,
1291				'An HTML Processor which has already started processing cannot serialize its contents. Serialize immediately after creating the instance.',
1292				E_USER_WARNING
1293			);
1294			return null;
1295		}
1296
1297		$html = '';
1298		while ( $this->next_token() ) {
1299			$html .= $this->serialize_token();
1300		}
1301
1302		if ( null !== $this->get_last_error() ) {
1303			wp_trigger_error(
1304				__METHOD__,
1305				"Cannot serialize HTML Processor with parsing error: {$this->get_last_error()}.",
1306				E_USER_WARNING
1307			);
1308			return null;
1309		}
1310
1311		return $html;
1312	}
1313
1314	/**
1315	 * Serializes the currently-matched token.
1316	 *
1317	 * This method produces a fully-normative HTML string for the currently-matched token,
1318	 * if able. If not matched at any token or if the token doesn't correspond to any HTML
1319	 * it will return an empty string (for example, presumptuous end tags are ignored).
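	 *
	 * A minimal usage sketch (illustrative only; the input markup is an
	 * assumption). Concatenating the serialization of every token should
	 * build the same string that {@see static::serialize()} returns:
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<p>Hi' );
	 *     $html      = '';
	 *     while ( $processor->next_token() ) {
	 *         $html .= $processor->serialize_token();
	 *     }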
1320	 *
1321	 * @see static::serialize()
1322	 *
1323	 * @since 6.7.0
1324	 * @since 6.9.0 Converted from protected to public method.
1325	 *
1326	 * @return string Serialization of token, or empty string if no serialization exists.
1327	 */
1328	public function serialize_token(): string {
1329		$html       = '';
1330		$token_type = $this->get_token_type();
1331
1332		switch ( $token_type ) {
1333			case '#doctype':
1334				$doctype = $this->get_doctype_info();
1335				if ( null === $doctype ) {
1336					break;
1337				}
1338
1339				$html .= '<!DOCTYPE';
1340
1341				if ( $doctype->name ) {
1342					$html .= " {$doctype->name}";
1343				}
1344
1345				if ( null !== $doctype->public_identifier ) {
1346					$quote = str_contains( $doctype->public_identifier, '"' ) ? "'" : '"';
1347					$html .= " PUBLIC {$quote}{$doctype->public_identifier}{$quote}";
1348				}
1349				if ( null !== $doctype->system_identifier ) {
1350					if ( null === $doctype->public_identifier ) {
1351						$html .= ' SYSTEM';
1352					}
1353					$quote = str_contains( $doctype->system_identifier, '"' ) ? "'" : '"';
1354					$html .= " {$quote}{$doctype->system_identifier}{$quote}";
1355				}
1356
1357				$html .= '>';
1358				break;
1359
1360			case '#text':
1361				$html .= htmlspecialchars( $this->get_modifiable_text(), ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8' );
1362				break;
1363
1364			// Unlike the `<>` which is interpreted as plaintext, this is ignored entirely.
1365			case '#presumptuous-tag':
1366				break;
1367
1368			case '#funky-comment':
1369			case '#comment':
1370				$html .= "<!--{$this->get_full_comment_text()}-->";
1371				break;
1372
1373			case '#cdata-section':
1374				$html .= "<![CDATA[{$this->get_modifiable_text()}]]>";
1375				break;
1376		}
1377
1378		if ( '#tag' !== $token_type ) {
1379			return $html;
1380		}
1381
1382		$tag_name       = str_replace( "\x00", "\u{FFFD}", $this->get_tag() );
1383		$in_html        = 'html' === $this->get_namespace();
1384		$qualified_name = $in_html ? strtolower( $tag_name ) : $this->get_qualified_tag_name();
1385
1386		if ( $this->is_tag_closer() ) {
1387			$html .= "</{$qualified_name}>";
1388			return $html;
1389		}
1390
1391		$attribute_names = $this->get_attribute_names_with_prefix( '' );
1392		if ( ! isset( $attribute_names ) ) {
1393			$html .= "<{$qualified_name}>";
1394			return $html;
1395		}
1396
1397		$html .= "<{$qualified_name}";
1398		foreach ( $attribute_names as $attribute_name ) {
1399			$html .= " {$this->get_qualified_attribute_name( $attribute_name )}";
1400			$value = $this->get_attribute( $attribute_name );
1401
1402			if ( is_string( $value ) ) {
				$html .= '="' . htmlspecialchars( $value, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8' ) . '"';
1404			}
1405
1406			$html = str_replace( "\x00", "\u{FFFD}", $html );
1407		}
1408
1409		if ( ! $in_html && $this->has_self_closing_flag() ) {
1410			$html .= ' /';
1411		}
1412
1413		$html .= '>';
1414
1415		// Flush out self-contained elements.
1416		if ( $in_html && in_array( $tag_name, array( 'IFRAME', 'NOEMBED', 'NOFRAMES', 'SCRIPT', 'STYLE', 'TEXTAREA', 'TITLE', 'XMP' ), true ) ) {
1417			$text = $this->get_modifiable_text();
1418
1419			switch ( $tag_name ) {
1420				case 'IFRAME':
1421				case 'NOEMBED':
1422				case 'NOFRAMES':
1423					$text = '';
1424					break;
1425
1426				case 'SCRIPT':
1427				case 'STYLE':
1428					break;
1429
1430				default:
1431					$text = htmlspecialchars( $text, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8' );
1432			}
1433
1434			$html .= "{$text}</{$qualified_name}>";
1435		}
1436
1437		return $html;
1438	}
1439
1440	/**
1441	 * Parses next element in the 'initial' insertion mode.
1442	 *
1443	 * This internal function performs the 'initial' insertion mode
1444	 * logic for the generalized WP_HTML_Processor::step() function.
1445	 *
1446	 * @since 6.7.0
1447	 *
1448	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1449	 *
1450	 * @see https://html.spec.whatwg.org/#the-initial-insertion-mode
1451	 * @see WP_HTML_Processor::step
1452	 *
1453	 * @return bool Whether an element was found.
1454	 */
1455	private function step_initial(): bool {
1456		$token_name = $this->get_token_name();
1457		$token_type = $this->get_token_type();
1458		$op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
1459		$op         = "{$op_sigil}{$token_name}";
1460
1461		switch ( $op ) {
1462			/*
1463			 * > A character token that is one of U+0009 CHARACTER TABULATION,
1464			 * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1465			 * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1466			 *
1467			 * Parse error: ignore the token.
1468			 */
1469			case '#text':
1470				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1471					return $this->step();
1472				}
1473				goto initial_anything_else;
1474				break;
1475
1476			/*
1477			 * > A comment token
1478			 */
1479			case '#comment':
1480			case '#funky-comment':
1481			case '#presumptuous-tag':
1482				$this->insert_html_element( $this->state->current_token );
1483				return true;
1484
1485			/*
1486			 * > A DOCTYPE token
1487			 */
1488			case 'html':
1489				$doctype = $this->get_doctype_info();
1490				if ( null !== $doctype && 'quirks' === $doctype->indicated_compatibility_mode ) {
1491					$this->compat_mode = WP_HTML_Tag_Processor::QUIRKS_MODE;
1492				}
1493
1494				/*
1495				 * > Then, switch the insertion mode to "before html".
1496				 */
1497				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML;
1498				$this->insert_html_element( $this->state->current_token );
1499				return true;
1500		}
1501
1502		/*
1503		 * > Anything else
1504		 */
1505		initial_anything_else:
1506		$this->compat_mode           = WP_HTML_Tag_Processor::QUIRKS_MODE;
1507		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML;
1508		return $this->step( self::REPROCESS_CURRENT_NODE );
1509	}
1510
1511	/**
1512	 * Parses next element in the 'before html' insertion mode.
1513	 *
1514	 * This internal function performs the 'before html' insertion mode
1515	 * logic for the generalized WP_HTML_Processor::step() function.
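	 *
	 * An illustrative sketch of the observable effect (the input is an
	 * assumption): when a full document omits its HTML tag, the processor
	 * creates one virtually, so it is expected to appear in the breadcrumbs.
	 *
	 *     $processor = WP_HTML_Processor::create_full_parser( '<p>Hi' );
	 *     $processor->next_tag( 'P' );
	 *     // Expected: array( 'HTML', 'BODY', 'P' )
	 *     $processor->get_breadcrumbs();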
1516	 *
1517	 * @since 6.7.0
1518	 *
1519	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1520	 *
1521	 * @see https://html.spec.whatwg.org/#the-before-html-insertion-mode
1522	 * @see WP_HTML_Processor::step
1523	 *
1524	 * @return bool Whether an element was found.
1525	 */
1526	private function step_before_html(): bool {
1527		$token_name = $this->get_token_name();
1528		$token_type = $this->get_token_type();
1529		$is_closer  = parent::is_tag_closer();
1530		$op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
1531		$op         = "{$op_sigil}{$token_name}";
1532
1533		switch ( $op ) {
1534			/*
1535			 * > A DOCTYPE token
1536			 */
1537			case 'html':
1538				// Parse error: ignore the token.
1539				return $this->step();
1540
1541			/*
1542			 * > A comment token
1543			 */
1544			case '#comment':
1545			case '#funky-comment':
1546			case '#presumptuous-tag':
1547				$this->insert_html_element( $this->state->current_token );
1548				return true;
1549
1550			/*
1551			 * > A character token that is one of U+0009 CHARACTER TABULATION,
1552			 * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1553			 * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1554			 *
1555			 * Parse error: ignore the token.
1556			 */
1557			case '#text':
1558				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
1559					return $this->step();
1560				}
1561				goto before_html_anything_else;
1562				break;
1563
1564			/*
1565			 * > A start tag whose tag name is "html"
1566			 */
1567			case '+HTML':
1568				$this->insert_html_element( $this->state->current_token );
1569				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD;
1570				return true;
1571
1572			/*
1573			 * > An end tag whose tag name is one of: "head", "body", "html", "br"
1574			 *
1575			 * Closing BR tags are always reported by the Tag Processor as opening tags.
1576			 */
1577			case '-HEAD':
1578			case '-BODY':
1579			case '-HTML':
1580				/*
1581				 * > Act as described in the "anything else" entry below.
1582				 */
1583				goto before_html_anything_else;
1584				break;
1585		}
1586
1587		/*
1588		 * > Any other end tag
1589		 */
1590		if ( $is_closer ) {
1591			// Parse error: ignore the token.
1592			return $this->step();
1593		}
1594
1595		/*
1596		 * > Anything else.
1597		 *
1598		 * > Create an html element whose node document is the Document object.
1599		 * > Append it to the Document object. Put this element in the stack of open elements.
1600		 * > Switch the insertion mode to "before head", then reprocess the token.
1601		 */
1602		before_html_anything_else:
1603		$this->insert_virtual_node( 'HTML' );
1604		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD;
1605		return $this->step( self::REPROCESS_CURRENT_NODE );
1606	}
1607
1608	/**
1609	 * Parses next element in the 'before head' insertion mode.
1610	 *
1611	 * This internal function performs the 'before head' insertion mode
1612	 * logic for the generalized WP_HTML_Processor::step() function.
1613	 *
1614	 * @since 6.7.0 Stub implementation.
1615	 *
1616	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
1617	 *
1618	 * @see https://html.spec.whatwg.org/#the-before-head-insertion-mode
1619	 * @see WP_HTML_Processor::step
1620	 *
1621	 * @return bool Whether an element was found.
1622	 */
1623	private function step_before_head(): bool {
1624		$token_name = $this->get_token_name();
1625		$token_type = $this->get_token_type();
1626		$is_closer  = parent::is_tag_closer();
1627		$op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
1628		$op         = "{$op_sigil}{$token_name}";
1629
1630		switch ( $op ) {
1631			/*
1632			 * > A character token that is one of U+0009 CHARACTER TABULATION,
1633			 * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
1634			 * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
1635			 *
1636			 * Parse error: ignore the token.
1637			 */
1638			case '#text':
				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
					return $this->step();
				}
				goto before_head_anything_else;
				break;

			/*
			 * > A comment token
			 */
			case '#comment':
			case '#funky-comment':
			case '#presumptuous-tag':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A DOCTYPE token
			 */
			case 'html':
				// Parse error: ignore the token.
				return $this->step();

			/*
			 * > A start tag whose tag name is "html"
			 */
			case '+HTML':
				return $this->step_in_body();

			/*
			 * > A start tag whose tag name is "head"
			 */
			case '+HEAD':
				$this->insert_html_element( $this->state->current_token );
				$this->state->head_element   = $this->state->current_token;
				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
				return true;

			/*
			 * > An end tag whose tag name is one of: "head", "body", "html", "br"
			 * > Act as described in the "anything else" entry below.
			 *
			 * Closing BR tags are always reported by the Tag Processor as opening tags.
			 */
			case '-HEAD':
			case '-BODY':
			case '-HTML':
				goto before_head_anything_else;
				break;
		}

		if ( $is_closer ) {
			// Parse error: ignore the token.
			return $this->step();
		}

		/*
		 * > Anything else
		 *
		 * > Insert an HTML element for a "head" start tag token with no attributes.
		 */
		before_head_anything_else:
		$this->state->head_element   = $this->insert_virtual_node( 'HEAD' );
		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
		return $this->step( self::REPROCESS_CURRENT_NODE );
	}

	/**
	 * Parses next element in the 'in head' insertion mode.
	 *
	 * This internal function performs the 'in head' insertion mode
	 * logic for the generalized WP_HTML_Processor::step() function.
	 *
	 * @since 6.7.0
	 *
	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
	 *
	 * @see https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inhead
	 * @see WP_HTML_Processor::step
	 *
	 * @return bool Whether an element was found.
	 */
	private function step_in_head(): bool {
		$token_name = $this->get_token_name();
		$token_type = $this->get_token_type();
		$is_closer  = parent::is_tag_closer();
		$op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
		$op         = "{$op_sigil}{$token_name}";

		switch ( $op ) {
			case '#text':
				/*
				 * > A character token that is one of U+0009 CHARACTER TABULATION,
				 * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
				 * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
				 */
				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
					// Insert the character.
					$this->insert_html_element( $this->state->current_token );
					return true;
				}

				goto in_head_anything_else;
				break;

			/*
			 * > A comment token
			 */
			case '#comment':
			case '#funky-comment':
			case '#presumptuous-tag':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A DOCTYPE token
			 */
			case 'html':
				// Parse error: ignore the token.
				return $this->step();

			/*
			 * > A start tag whose tag name is "html"
			 */
			case '+HTML':
				return $this->step_in_body();

			/*
			 * > A start tag whose tag name is one of: "base", "basefont", "bgsound", "link"
			 */
			case '+BASE':
			case '+BASEFONT':
			case '+BGSOUND':
			case '+LINK':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A start tag whose tag name is "meta"
			 */
			case '+META':
				$this->insert_html_element( $this->state->current_token );

				// All following conditions depend on "tentative" encoding confidence.
				if ( 'tentative' !== $this->state->encoding_confidence ) {
					return true;
				}

				/*
				 * > If the active speculative HTML parser is null, then:
				 * >   - If the element has a charset attribute, and getting an encoding from
				 * >     its value results in an encoding, and the confidence is currently
				 * >     tentative, then change the encoding to the resulting encoding.
				 */
				$charset = $this->get_attribute( 'charset' );
				if ( is_string( $charset ) ) {
					$this->bail( 'Cannot yet process META tags with charset to determine encoding.' );
				}

				/*
				 * >   - Otherwise, if the element has an http-equiv attribute whose value is
				 * >     an ASCII case-insensitive match for the string "Content-Type", and
				 * >     the element has a content attribute, and applying the algorithm for
				 * >     extracting a character encoding from a meta element to that attribute's
				 * >     value returns an encoding, and the confidence is currently tentative,
				 * >     then change the encoding to the extracted encoding.
				 */
				$http_equiv = $this->get_attribute( 'http-equiv' );
				$content    = $this->get_attribute( 'content' );
				if (
					is_string( $http_equiv ) &&
					is_string( $content ) &&
					0 === strcasecmp( $http_equiv, 'Content-Type' )
				) {
					$this->bail( 'Cannot yet process META tags with http-equiv Content-Type to determine encoding.' );
				}

				return true;

			/*
			 * > A start tag whose tag name is "title"
			 */
			case '+TITLE':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A start tag whose tag name is "noscript", if the scripting flag is enabled
			 * > A start tag whose tag name is one of: "noframes", "style"
			 *
			 * The scripting flag is never enabled in this parser.
			 */
			case '+NOFRAMES':
			case '+STYLE':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A start tag whose tag name is "noscript", if the scripting flag is disabled
			 */
			case '+NOSCRIPT':
				$this->insert_html_element( $this->state->current_token );
				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD_NOSCRIPT;
				return true;

			/*
			 * > A start tag whose tag name is "script"
			 *
			 * @todo Could the adjusted insertion location be anything other than the current location?
			 */
			case '+SCRIPT':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > An end tag whose tag name is "head"
			 */
			case '-HEAD':
				$this->state->stack_of_open_elements->pop();
				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD;
				return true;

			/*
			 * > An end tag whose tag name is one of: "body", "html", "br"
			 *
			 * Closing BR tags are always reported by the Tag Processor as opening tags.
			 */
			case '-BODY':
			case '-HTML':
				/*
				 * > Act as described in the "anything else" entry below.
				 */
				goto in_head_anything_else;
				break;

			/*
			 * > A start tag whose tag name is "template"
			 *
			 * @todo Could the adjusted insertion location be anything other than the current location?
			 */
			case '+TEMPLATE':
				$this->state->active_formatting_elements->insert_marker();
				$this->state->frameset_ok = false;

				$this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
				$this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;

				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > An end tag whose tag name is "template"
			 */
			case '-TEMPLATE':
				if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
					// @todo Indicate a parse error once it's possible.
					return $this->step();
				}

				$this->generate_implied_end_tags_thoroughly();
				if ( ! $this->state->stack_of_open_elements->current_node_is( 'TEMPLATE' ) ) {
					// @todo Indicate a parse error once it's possible.
				}

				$this->state->stack_of_open_elements->pop_until( 'TEMPLATE' );
				$this->state->active_formatting_elements->clear_up_to_last_marker();
				array_pop( $this->state->stack_of_template_insertion_modes );
				$this->reset_insertion_mode_appropriately();
				return true;
		}

		/*
		 * > A start tag whose tag name is "head"
		 * > Any other end tag
		 */
		if ( '+HEAD' === $op || $is_closer ) {
			// Parse error: ignore the token.
			return $this->step();
		}

		/*
		 * > Anything else
		 */
		in_head_anything_else:
		$this->state->stack_of_open_elements->pop();
		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD;
		return $this->step( self::REPROCESS_CURRENT_NODE );
	}
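
	/*
	 * Illustration only; not part of the HTML API.
	 *
	 * Each `step_*` method in this class builds a compact `$op` key from the
	 * token type, the tag-closer flag, and the token name, then dispatches on
	 * that key in a `switch`. The standalone sketch below mirrors that
	 * construction. The function name `example_build_op` is hypothetical and
	 * exists only for this example; the token type and name strings are the
	 * ones the `case` labels in this file match against.
	 *
	 *     function example_build_op( string $token_type, string $token_name, bool $is_closer ): string {
	 *         // Only tags carry a sigil: "+" for openers and "-" for closers.
	 *         $op_sigil = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
	 *         return "{$op_sigil}{$token_name}";
	 *     }
	 *
	 *     example_build_op( '#tag', 'HEAD', false );     // '+HEAD'
	 *     example_build_op( '#tag', 'HEAD', true );      // '-HEAD'
	 *     example_build_op( '#text', '#text', false );   // '#text'
	 *     example_build_op( '#doctype', 'html', false ); // 'html'
	 *
	 * Because non-tag tokens get no sigil, a DOCTYPE token (`'html'`) can never
	 * collide with an HTML start tag (`'+HTML'`) or end tag (`'-HTML'`).
	 */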

	/**
	 * Parses next element in the 'in head noscript' insertion mode.
	 *
	 * This internal function performs the 'in head noscript' insertion mode
	 * logic for the generalized WP_HTML_Processor::step() function.
	 *
	 * @since 6.7.0 Stub implementation.
	 *
	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
	 *
	 * @see https://html.spec.whatwg.org/#parsing-main-inheadnoscript
	 * @see WP_HTML_Processor::step
	 *
	 * @return bool Whether an element was found.
	 */
	private function step_in_head_noscript(): bool {
		$token_name = $this->get_token_name();
		$token_type = $this->get_token_type();
		$is_closer  = parent::is_tag_closer();
		$op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
		$op         = "{$op_sigil}{$token_name}";

		switch ( $op ) {
			/*
			 * > A character token that is one of U+0009 CHARACTER TABULATION,
			 * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
			 * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
			 *
			 * > Process the token using the rules for the "in head" insertion mode.
			 */
			case '#text':
				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
					return $this->step_in_head();
				}

				goto in_head_noscript_anything_else;
				break;

			/*
			 * > A DOCTYPE token
			 */
			case 'html':
				// Parse error: ignore the token.
				return $this->step();

			/*
			 * > A start tag whose tag name is "html"
			 */
			case '+HTML':
				return $this->step_in_body();

			/*
			 * > An end tag whose tag name is "noscript"
			 */
			case '-NOSCRIPT':
				$this->state->stack_of_open_elements->pop();
				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
				return true;

			/*
			 * > A comment token
			 * >
			 * > A start tag whose tag name is one of: "basefont", "bgsound",
			 * > "link", "meta", "noframes", "style"
			 */
			case '#comment':
			case '#funky-comment':
			case '#presumptuous-tag':
			case '+BASEFONT':
			case '+BGSOUND':
			case '+LINK':
			case '+META':
			case '+NOFRAMES':
			case '+STYLE':
				return $this->step_in_head();

			/*
			 * > An end tag whose tag name is "br"
			 *
			 * This should never happen, as the Tag Processor prevents showing a BR closing tag.
			 */
		}

		/*
		 * > A start tag whose tag name is one of: "head", "noscript"
		 * > Any other end tag
		 */
		if ( '+HEAD' === $op || '+NOSCRIPT' === $op || $is_closer ) {
			// Parse error: ignore the token.
			return $this->step();
		}

		/*
		 * > Anything else
		 *
		 * Anything here is a parse error.
		 */
		in_head_noscript_anything_else:
		$this->state->stack_of_open_elements->pop();
		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
		return $this->step( self::REPROCESS_CURRENT_NODE );
	}

	/**
	 * Parses next element in the 'after head' insertion mode.
	 *
	 * This internal function performs the 'after head' insertion mode
	 * logic for the generalized WP_HTML_Processor::step() function.
	 *
	 * @since 6.7.0 Stub implementation.
	 *
	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
	 *
	 * @see https://html.spec.whatwg.org/#the-after-head-insertion-mode
	 * @see WP_HTML_Processor::step
	 *
	 * @return bool Whether an element was found.
	 */
	private function step_after_head(): bool {
		$token_name = $this->get_token_name();
		$token_type = $this->get_token_type();
		$is_closer  = parent::is_tag_closer();
		$op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
		$op         = "{$op_sigil}{$token_name}";

		switch ( $op ) {
			/*
			 * > A character token that is one of U+0009 CHARACTER TABULATION,
			 * > U+000A LINE FEED (LF), U+000C FORM FEED (FF),
			 * > U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
			 */
			case '#text':
				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
					// Insert the character.
					$this->insert_html_element( $this->state->current_token );
					return true;
				}
				goto after_head_anything_else;
				break;

			/*
			 * > A comment token
			 */
			case '#comment':
			case '#funky-comment':
			case '#presumptuous-tag':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A DOCTYPE token
			 */
			case 'html':
				// Parse error: ignore the token.
				return $this->step();

			/*
			 * > A start tag whose tag name is "html"
			 */
			case '+HTML':
				return $this->step_in_body();

			/*
			 * > A start tag whose tag name is "body"
			 */
			case '+BODY':
				$this->insert_html_element( $this->state->current_token );
				$this->state->frameset_ok    = false;
				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
				return true;

			/*
			 * > A start tag whose tag name is "frameset"
			 */
			case '+FRAMESET':
				$this->insert_html_element( $this->state->current_token );
				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET;
				return true;

			/*
			 * > A start tag whose tag name is one of: "base", "basefont", "bgsound",
			 * > "link", "meta", "noframes", "script", "style", "template", "title"
			 *
			 * Anything here is a parse error.
			 */
			case '+BASE':
			case '+BASEFONT':
			case '+BGSOUND':
			case '+LINK':
			case '+META':
			case '+NOFRAMES':
			case '+SCRIPT':
			case '+STYLE':
			case '+TEMPLATE':
			case '+TITLE':
				/*
				 * > Push the node pointed to by the head element pointer onto the stack of open elements.
				 * > Process the token using the rules for the "in head" insertion mode.
				 * > Remove the node pointed to by the head element pointer from the stack of open elements. (It might not be the current node at this point.)
				 */
				$this->bail( 'Cannot process elements after HEAD which reopen the HEAD element.' );
				/*
				 * Do not leave this break in when adding support; it's here to prevent
				 * WPCS from getting confused at the switch structure without a return,
				 * because it doesn't know that `bail()` always throws.
				 */
				break;

			/*
			 * > An end tag whose tag name is "template"
			 */
			case '-TEMPLATE':
				return $this->step_in_head();

			/*
			 * > An end tag whose tag name is one of: "body", "html", "br"
			 *
			 * Closing BR tags are always reported by the Tag Processor as opening tags.
			 */
			case '-BODY':
			case '-HTML':
				/*
				 * > Act as described in the "anything else" entry below.
				 */
				goto after_head_anything_else;
				break;
		}

		/*
		 * > A start tag whose tag name is "head"
		 * > Any other end tag
		 */
		if ( '+HEAD' === $op || $is_closer ) {
			// Parse error: ignore the token.
			return $this->step();
		}

		/*
		 * > Anything else
		 * > Insert an HTML element for a "body" start tag token with no attributes.
		 */
		after_head_anything_else:
		$this->insert_virtual_node( 'BODY' );
		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
		return $this->step( self::REPROCESS_CURRENT_NODE );
	}
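
	/*
	 * Illustration only; not part of the HTML API.
	 *
	 * The "anything else" branches of the head-related insertion modes each
	 * insert an implied element (or pop HEAD), switch the insertion mode, and
	 * then reprocess the same token in the new mode. The standalone sketch
	 * below is a deliberately simplified model of that hand-off for a single
	 * start tag. It ignores end tags, text, comments, and unsupported markup.
	 * The function name and the lowercase mode strings are hypothetical.
	 *
	 *     function example_implied_elements( string $op ): array {
	 *         $mode     = 'before head';
	 *         $inserted = array();
	 *
	 *         while ( true ) {
	 *             switch ( $mode ) {
	 *                 case 'before head':
	 *                     if ( '+HEAD' === $op ) {
	 *                         $inserted[] = 'HEAD';
	 *                         return $inserted;
	 *                     }
	 *                     // Anything else: imply HEAD, then reprocess in "in head".
	 *                     $inserted[] = 'HEAD (virtual)';
	 *                     $mode       = 'in head';
	 *                     break;
	 *
	 *                 case 'in head':
	 *                     if ( in_array( $op, array( '+TITLE', '+META', '+LINK' ), true ) ) {
	 *                         $inserted[] = substr( $op, 1 );
	 *                         return $inserted;
	 *                     }
	 *                     // Anything else: HEAD is popped, then reprocess in "after head".
	 *                     $mode = 'after head';
	 *                     break;
	 *
	 *                 case 'after head':
	 *                     if ( '+BODY' === $op ) {
	 *                         $inserted[] = 'BODY';
	 *                         return $inserted;
	 *                     }
	 *                     // Anything else: imply BODY, then reprocess in "in body".
	 *                     $inserted[] = 'BODY (virtual)';
	 *                     $mode       = 'in body';
	 *                     break;
	 *
	 *                 default:
	 *                     // "in body": the token finally lands here.
	 *                     $inserted[] = substr( $op, 1 );
	 *                     return $inserted;
	 *             }
	 *         }
	 *     }
	 *
	 *     example_implied_elements( '+P' );     // array( 'HEAD (virtual)', 'BODY (virtual)', 'P' )
	 *     example_implied_elements( '+TITLE' ); // array( 'HEAD (virtual)', 'TITLE' )
	 *
	 * The real methods express the same idea with
	 * `return $this->step( self::REPROCESS_CURRENT_NODE )` rather than a loop.
	 */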

	/**
	 * Parses next element in the 'in body' insertion mode.
	 *
	 * This internal function performs the 'in body' insertion mode
	 * logic for the generalized WP_HTML_Processor::step() function.
	 *
	 * @since 6.4.0
	 *
	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
	 *
	 * @see https://html.spec.whatwg.org/#parsing-main-inbody
	 * @see WP_HTML_Processor::step
	 *
	 * @return bool Whether an element was found.
	 */
	private function step_in_body(): bool {
		$token_name = $this->get_token_name();
		$token_type = $this->get_token_type();
		$op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
		$op         = "{$op_sigil}{$token_name}";

		switch ( $op ) {
			case '#text':
				/*
				 * > A character token that is U+0000 NULL
				 *
				 * Any successive sequence of NULL bytes is ignored and won't
				 * trigger active format reconstruction. Therefore, if the text
				 * only comprises NULL bytes then the token should be ignored
				 * here, but if there are any other characters in the stream
				 * the active formats should be reconstructed.
				 */
				if ( parent::TEXT_IS_NULL_SEQUENCE === $this->text_node_classification ) {
					// Parse error: ignore the token.
					return $this->step();
				}

				$this->reconstruct_active_formatting_elements();

				/*
				 * Whitespace-only text does not affect the frameset-ok flag.
				 * It is probably inter-element whitespace, but it may also
				 * contain character references which decode only to whitespace.
				 */
				if ( parent::TEXT_IS_GENERIC === $this->text_node_classification ) {
					$this->state->frameset_ok = false;
				}

				$this->insert_html_element( $this->state->current_token );
				return true;

			case '#comment':
			case '#funky-comment':
			case '#presumptuous-tag':
				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A DOCTYPE token
			 * > Parse error. Ignore the token.
			 */
			case 'html':
				return $this->step();

			/*
			 * > A start tag whose tag name is "html"
			 */
			case '+HTML':
				if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
					/*
					 * > Otherwise, for each attribute on the token, check to see if the attribute
					 * > is already present on the top element of the stack of open elements. If
					 * > it is not, add the attribute and its corresponding value to that element.
					 *
					 * This parser does not currently support this behavior: ignore the token.
					 */
				}

				// Ignore the token.
				return $this->step();

			/*
			 * > A start tag whose tag name is one of: "base", "basefont", "bgsound", "link",
			 * > "meta", "noframes", "script", "style", "template", "title"
			 * >
			 * > An end tag whose tag name is "template"
			 */
			case '+BASE':
			case '+BASEFONT':
			case '+BGSOUND':
			case '+LINK':
			case '+META':
			case '+NOFRAMES':
			case '+SCRIPT':
			case '+STYLE':
			case '+TEMPLATE':
			case '+TITLE':
			case '-TEMPLATE':
				return $this->step_in_head();

			/*
			 * > A start tag whose tag name is "body"
			 *
			 * This tag in the IN BODY insertion mode is a parse error.
			 */
			case '+BODY':
				if (
					1 === $this->state->stack_of_open_elements->count() ||
					'BODY' !== ( $this->state->stack_of_open_elements->at( 2 )->node_name ?? null ) ||
					$this->state->stack_of_open_elements->contains( 'TEMPLATE' )
				) {
					// Ignore the token.
					return $this->step();
				}

				/*
				 * > Otherwise, set the frameset-ok flag to "not ok"; then, for each attribute
				 * > on the token, check to see if the attribute is already present on the body
				 * > element (the second element) on the stack of open elements, and if it is
				 * > not, add the attribute and its corresponding value to that element.
				 *
				 * This parser does not currently support this behavior: ignore the token.
				 */
				$this->state->frameset_ok = false;
				return $this->step();

			/*
			 * > A start tag whose tag name is "frameset"
			 *
			 * This tag in the IN BODY insertion mode is a parse error.
			 */
			case '+FRAMESET':
				if (
					1 === $this->state->stack_of_open_elements->count() ||
					'BODY' !== ( $this->state->stack_of_open_elements->at( 2 )->node_name ?? null ) ||
					false === $this->state->frameset_ok
				) {
					// Ignore the token.
					return $this->step();
				}

				/*
				 * > Otherwise, run the following steps:
				 */
				$this->bail( 'Cannot process non-ignored FRAMESET tags.' );
				break;

			/*
			 * > An end tag whose tag name is "body"
			 */
			case '-BODY':
				if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'BODY' ) ) {
					// Parse error: ignore the token.
					return $this->step();
				}

				/*
				 * > Otherwise, if there is a node in the stack of open elements that is not either a
				 * > dd element, a dt element, an li element, an optgroup element, an option element,
				 * > a p element, an rb element, an rp element, an rt element, an rtc element, a tbody
				 * > element, a td element, a tfoot element, a th element, a thead element, a tr
				 * > element, the body element, or the html element, then this is a parse error.
				 *
				 * There is nothing to do for this parse error, so don't check for it.
				 */

				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY;
				/*
				 * The BODY element is not removed from the stack of open elements.
				 * Only internal state has changed, this does not qualify as a "step"
				 * in terms of advancing through the document to another token.
				 * Nothing has been pushed or popped.
				 * Proceed to parse the next item.
				 */
				return $this->step();

			/*
			 * > An end tag whose tag name is "html"
			 */
			case '-HTML':
				if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'BODY' ) ) {
					// Parse error: ignore the token.
					return $this->step();
				}

				/*
				 * > Otherwise, if there is a node in the stack of open elements that is not either a
				 * > dd element, a dt element, an li element, an optgroup element, an option element,
				 * > a p element, an rb element, an rp element, an rt element, an rtc element, a tbody
				 * > element, a td element, a tfoot element, a th element, a thead element, a tr
				 * > element, the body element, or the html element, then this is a parse error.
				 *
				 * There is nothing to do for this parse error, so don't check for it.
				 */

				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY;
				return $this->step( self::REPROCESS_CURRENT_NODE );

			/*
			 * > A start tag whose tag name is one of: "address", "article", "aside",
			 * > "blockquote", "center", "details", "dialog", "dir", "div", "dl",
			 * > "fieldset", "figcaption", "figure", "footer", "header", "hgroup",
			 * > "main", "menu", "nav", "ol", "p", "search", "section", "summary", "ul"
			 */
			case '+ADDRESS':
			case '+ARTICLE':
			case '+ASIDE':
			case '+BLOCKQUOTE':
			case '+CENTER':
			case '+DETAILS':
			case '+DIALOG':
			case '+DIR':
			case '+DIV':
			case '+DL':
			case '+FIELDSET':
			case '+FIGCAPTION':
			case '+FIGURE':
			case '+FOOTER':
			case '+HEADER':
			case '+HGROUP':
			case '+MAIN':
			case '+MENU':
			case '+NAV':
			case '+OL':
			case '+P':
			case '+SEARCH':
			case '+SECTION':
			case '+SUMMARY':
			case '+UL':
				if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
					$this->close_a_p_element();
				}

				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A start tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"
			 */
			case '+H1':
			case '+H2':
			case '+H3':
			case '+H4':
			case '+H5':
			case '+H6':
				if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
					$this->close_a_p_element();
				}

				if (
					in_array(
						$this->state->stack_of_open_elements->current_node()->node_name,
						array( 'H1', 'H2', 'H3', 'H4', 'H5', 'H6' ),
						true
					)
				) {
					// @todo Indicate a parse error once it's possible.
					$this->state->stack_of_open_elements->pop();
				}

				$this->insert_html_element( $this->state->current_token );
				return true;

			/*
			 * > A start tag whose tag name is one of: "pre", "listing"
			 */
			case '+PRE':
			case '+LISTING':
				if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
					$this->close_a_p_element();
				}

				/*
				 * > If the next token is a U+000A LINE FEED (LF) character token,
				 * > then ignore that token and move on to the next one. (Newlines
				 * > at the start of pre blocks are ignored as an authoring convenience.)
				 *
				 * This is handled in `get_modifiable_text()`.
				 */

				$this->insert_html_element( $this->state->current_token );
				$this->state->frameset_ok = false;
				return true;

			/*
			 * > A start tag whose tag name is "form"
			 */
			case '+FORM':
				$stack_contains_template = $this->state->stack_of_open_elements->contains( 'TEMPLATE' );

				if ( isset( $this->state->form_element ) && ! $stack_contains_template ) {
					// Parse error: ignore the token.
					return $this->step();
				}

				if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
					$this->close_a_p_element();
				}

				$this->insert_html_element( $this->state->current_token );
				if ( ! $stack_contains_template ) {
					$this->state->form_element = $this->state->current_token;
				}

				return true;

			/*
			 * > A start tag whose tag name is "li"
			 * > A start tag whose tag name is one of: "dd", "dt"
			 */
			case '+DD':
			case '+DT':
			case '+LI':
				$this->state->frameset_ok = false;
				$node                     = $this->state->stack_of_open_elements->current_node();
				$is_li                    = 'LI' === $token_name;

				in_body_list_loop:
				/*
				 * The logic for LI and DT/DD is the same except for one point: LI elements _only_
				 * close other LI elements, but a DT or DD element closes _any_ open DT or DD element.
				 */
				if ( $is_li ? 'LI' === $node->node_name : ( 'DD' === $node->node_name || 'DT' === $node->node_name ) ) {
					$node_name = $is_li ? 'LI' : $node->node_name;
					$this->generate_implied_end_tags( $node_name );
					if ( ! $this->state->stack_of_open_elements->current_node_is( $node_name ) ) {
						// @todo Indicate a parse error once it's possible. This error does not impact the logic here.
					}

					$this->state->stack_of_open_elements->pop_until( $node_name );
					goto in_body_list_done;
				}

				if (
					'ADDRESS' !== $node->node_name &&
					'DIV' !== $node->node_name &&
					'P' !== $node->node_name &&
					self::is_special( $node )
				) {
					/*
					 * > If node is in the special category, but is not an address, div,
					 * > or p element, then jump to the step labeled done below.
					 */
					goto in_body_list_done;
				} else {
					/*
					 * > Otherwise, set node to the previous entry in the stack of open elements
					 * > and return to the step labeled loop.
					 */
					foreach ( $this->state->stack_of_open_elements->walk_up( $node ) as $item ) {
						$node = $item;
						break;
					}
					goto in_body_list_loop;
				}

				in_body_list_done:
				if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
					$this->close_a_p_element();
				}

				$this->insert_html_element( $this->state->current_token );
2536				return true;
2537
2538			case '+PLAINTEXT':
2539				if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2540					$this->close_a_p_element();
2541				}
2542
2543				/*
2544				 * @todo This may need to be handled in the Tag Processor and turn into
2545				 *       a single self-contained tag like TEXTAREA, whose modifiable text
2546				 *       is the rest of the input document as plaintext.
2547				 */
2548				$this->bail( 'Cannot process PLAINTEXT elements.' );
2549				break;
2550
2551			/*
2552			 * > A start tag whose tag name is "button"
2553			 */
2554			case '+BUTTON':
2555				if ( $this->state->stack_of_open_elements->has_element_in_scope( 'BUTTON' ) ) {
2556					// @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2557					$this->generate_implied_end_tags();
2558					$this->state->stack_of_open_elements->pop_until( 'BUTTON' );
2559				}
2560
2561				$this->reconstruct_active_formatting_elements();
2562				$this->insert_html_element( $this->state->current_token );
2563				$this->state->frameset_ok = false;
2564
2565				return true;
2566
2567			/*
2568			 * > An end tag whose tag name is one of: "address", "article", "aside", "blockquote",
2569			 * > "button", "center", "details", "dialog", "dir", "div", "dl", "fieldset",
2570			 * > "figcaption", "figure", "footer", "header", "hgroup", "listing", "main",
2571			 * > "menu", "nav", "ol", "pre", "search", "section", "summary", "ul"
2572			 */
2573			case '-ADDRESS':
2574			case '-ARTICLE':
2575			case '-ASIDE':
2576			case '-BLOCKQUOTE':
2577			case '-BUTTON':
2578			case '-CENTER':
2579			case '-DETAILS':
2580			case '-DIALOG':
2581			case '-DIR':
2582			case '-DIV':
2583			case '-DL':
2584			case '-FIELDSET':
2585			case '-FIGCAPTION':
2586			case '-FIGURE':
2587			case '-FOOTER':
2588			case '-HEADER':
2589			case '-HGROUP':
2590			case '-LISTING':
2591			case '-MAIN':
2592			case '-MENU':
2593			case '-NAV':
2594			case '-OL':
2595			case '-PRE':
2596			case '-SEARCH':
2597			case '-SECTION':
2598			case '-SUMMARY':
2599			case '-UL':
2600				if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $token_name ) ) {
2601					// @todo Report parse error.
2602					// Ignore the token.
2603					return $this->step();
2604				}
2605
2606				$this->generate_implied_end_tags();
2607				if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2608					// @todo Record parse error: this error doesn't impact parsing.
2609				}
2610				$this->state->stack_of_open_elements->pop_until( $token_name );
2611				return true;
2612
2613			/*
2614			 * > An end tag whose tag name is "form"
2615			 */
2616			case '-FORM':
2617				if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
2618					$node                      = $this->state->form_element;
2619					$this->state->form_element = null;
2620
2621					/*
2622					 * > If node is null or if the stack of open elements does not have node
2623					 * > in scope, then this is a parse error; return and ignore the token.
2624					 *
2625					 * @todo It's necessary to check if the form token itself is in scope, not
2626					 *       simply whether any FORM is in scope.
2627					 */
2628					if (
2629						null === $node ||
2630						! $this->state->stack_of_open_elements->has_element_in_scope( 'FORM' )
2631					) {
2632						// Parse error: ignore the token.
2633						return $this->step();
2634					}
2635
2636					$this->generate_implied_end_tags();
2637					if ( $node !== $this->state->stack_of_open_elements->current_node() ) {
						// @todo Indicate a parse error once it's possible. Until then, bail: removing a FORM that is not the current node is unsupported.
2639						$this->bail( 'Cannot close a FORM when other elements remain open as this would throw off the breadcrumbs for the following tokens.' );
2640					}
2641
2642					$this->state->stack_of_open_elements->remove_node( $node );
2643					return true;
2644				} else {
2645					/*
2646					 * > If the stack of open elements does not have a form element in scope,
2647					 * > then this is a parse error; return and ignore the token.
2648					 *
2649					 * Note that unlike in the clause above, this is checking for any FORM in scope.
2650					 */
2651					if ( ! $this->state->stack_of_open_elements->has_element_in_scope( 'FORM' ) ) {
2652						// Parse error: ignore the token.
2653						return $this->step();
2654					}
2655
2656					$this->generate_implied_end_tags();
2657
2658					if ( ! $this->state->stack_of_open_elements->current_node_is( 'FORM' ) ) {
2659						// @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2660					}
2661
2662					$this->state->stack_of_open_elements->pop_until( 'FORM' );
2663					return true;
2664				}
2665				break;
2666
2667			/*
2668			 * > An end tag whose tag name is "p"
2669			 */
2670			case '-P':
2671				if ( ! $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2672					$this->insert_html_element( $this->state->current_token );
2673				}
2674
2675				$this->close_a_p_element();
2676				return true;
2677
2678			/*
2679			 * > An end tag whose tag name is "li"
2680			 * > An end tag whose tag name is one of: "dd", "dt"
2681			 */
2682			case '-DD':
2683			case '-DT':
2684			case '-LI':
2685				if (
2686					/*
2687					 * An end tag whose tag name is "li":
2688					 * If the stack of open elements does not have an li element in list item scope,
2689					 * then this is a parse error; ignore the token.
2690					 */
2691					(
2692						'LI' === $token_name &&
2693						! $this->state->stack_of_open_elements->has_element_in_list_item_scope( 'LI' )
2694					) ||
2695					/*
2696					 * An end tag whose tag name is one of: "dd", "dt":
2697					 * If the stack of open elements does not have an element in scope that is an
2698					 * HTML element with the same tag name as that of the token, then this is a
2699					 * parse error; ignore the token.
2700					 */
2701					(
2702						'LI' !== $token_name &&
2703						! $this->state->stack_of_open_elements->has_element_in_scope( $token_name )
2704					)
2705				) {
2706					/*
2707					 * This is a parse error, ignore the token.
2708					 *
2709					 * @todo Indicate a parse error once it's possible.
2710					 */
2711					return $this->step();
2712				}
2713
2714				$this->generate_implied_end_tags( $token_name );
2715
2716				if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2717					// @todo Indicate a parse error once it's possible. This error does not impact the logic here.
2718				}
2719
2720				$this->state->stack_of_open_elements->pop_until( $token_name );
2721				return true;
2722
2723			/*
2724			 * > An end tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"
2725			 */
2726			case '-H1':
2727			case '-H2':
2728			case '-H3':
2729			case '-H4':
2730			case '-H5':
2731			case '-H6':
2732				if ( ! $this->state->stack_of_open_elements->has_element_in_scope( '(internal: H1 through H6 - do not use)' ) ) {
2733					/*
2734					 * This is a parse error; ignore the token.
2735					 *
2736					 * @todo Indicate a parse error once it's possible.
2737					 */
2738					return $this->step();
2739				}
2740
2741				$this->generate_implied_end_tags();
2742
2743				if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2744					// @todo Record parse error: this error doesn't impact parsing.
2745				}
2746
2747				$this->state->stack_of_open_elements->pop_until( '(internal: H1 through H6 - do not use)' );
2748				return true;
2749
2750			/*
2751			 * > A start tag whose tag name is "a"
2752			 */
2753			case '+A':
2754				foreach ( $this->state->active_formatting_elements->walk_up() as $item ) {
2755					switch ( $item->node_name ) {
2756						case 'marker':
2757							break 2;
2758
2759						case 'A':
2760							$this->run_adoption_agency_algorithm();
2761							$this->state->active_formatting_elements->remove_node( $item );
2762							$this->state->stack_of_open_elements->remove_node( $item );
2763							break 2;
2764					}
2765				}
2766
2767				$this->reconstruct_active_formatting_elements();
2768				$this->insert_html_element( $this->state->current_token );
2769				$this->state->active_formatting_elements->push( $this->state->current_token );
2770				return true;
2771
2772			/*
2773			 * > A start tag whose tag name is one of: "b", "big", "code", "em", "font", "i",
2774			 * > "s", "small", "strike", "strong", "tt", "u"
2775			 */
2776			case '+B':
2777			case '+BIG':
2778			case '+CODE':
2779			case '+EM':
2780			case '+FONT':
2781			case '+I':
2782			case '+S':
2783			case '+SMALL':
2784			case '+STRIKE':
2785			case '+STRONG':
2786			case '+TT':
2787			case '+U':
2788				$this->reconstruct_active_formatting_elements();
2789				$this->insert_html_element( $this->state->current_token );
2790				$this->state->active_formatting_elements->push( $this->state->current_token );
2791				return true;
2792
2793			/*
2794			 * > A start tag whose tag name is "nobr"
2795			 */
2796			case '+NOBR':
2797				$this->reconstruct_active_formatting_elements();
2798
2799				if ( $this->state->stack_of_open_elements->has_element_in_scope( 'NOBR' ) ) {
2800					// Parse error.
2801					$this->run_adoption_agency_algorithm();
2802					$this->reconstruct_active_formatting_elements();
2803				}
2804
2805				$this->insert_html_element( $this->state->current_token );
2806				$this->state->active_formatting_elements->push( $this->state->current_token );
2807				return true;
2808
2809			/*
2810			 * > An end tag whose tag name is one of: "a", "b", "big", "code", "em", "font", "i",
2811			 * > "nobr", "s", "small", "strike", "strong", "tt", "u"
2812			 */
2813			case '-A':
2814			case '-B':
2815			case '-BIG':
2816			case '-CODE':
2817			case '-EM':
2818			case '-FONT':
2819			case '-I':
2820			case '-NOBR':
2821			case '-S':
2822			case '-SMALL':
2823			case '-STRIKE':
2824			case '-STRONG':
2825			case '-TT':
2826			case '-U':
2827				$this->run_adoption_agency_algorithm();
2828				return true;
2829
2830			/*
2831			 * > A start tag whose tag name is one of: "applet", "marquee", "object"
2832			 */
2833			case '+APPLET':
2834			case '+MARQUEE':
2835			case '+OBJECT':
2836				$this->reconstruct_active_formatting_elements();
2837				$this->insert_html_element( $this->state->current_token );
2838				$this->state->active_formatting_elements->insert_marker();
2839				$this->state->frameset_ok = false;
2840				return true;
2841
2842			/*
			 * > An end tag token whose tag name is one of: "applet", "marquee", "object"
2844			 */
2845			case '-APPLET':
2846			case '-MARQUEE':
2847			case '-OBJECT':
2848				if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $token_name ) ) {
2849					// Parse error: ignore the token.
2850					return $this->step();
2851				}
2852
2853				$this->generate_implied_end_tags();
2854				if ( ! $this->state->stack_of_open_elements->current_node_is( $token_name ) ) {
2855					// This is a parse error.
2856				}
2857
2858				$this->state->stack_of_open_elements->pop_until( $token_name );
2859				$this->state->active_formatting_elements->clear_up_to_last_marker();
2860				return true;
2861
2862			/*
2863			 * > A start tag whose tag name is "table"
2864			 */
2865			case '+TABLE':
2866				/*
2867				 * > If the Document is not set to quirks mode, and the stack of open elements
2868				 * > has a p element in button scope, then close a p element.
2869				 */
2870				if (
2871					WP_HTML_Tag_Processor::QUIRKS_MODE !== $this->compat_mode &&
2872					$this->state->stack_of_open_elements->has_p_in_button_scope()
2873				) {
2874					$this->close_a_p_element();
2875				}
2876
2877				$this->insert_html_element( $this->state->current_token );
2878				$this->state->frameset_ok    = false;
2879				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
2880				return true;
2881
2882			/*
2883			 * > An end tag whose tag name is "br"
2884			 *
2885			 * This is prevented from happening because the Tag Processor
2886			 * reports all closing BR tags as if they were opening tags.
2887			 */
2888
2889			/*
2890			 * > A start tag whose tag name is one of: "area", "br", "embed", "img", "keygen", "wbr"
2891			 */
2892			case '+AREA':
2893			case '+BR':
2894			case '+EMBED':
2895			case '+IMG':
2896			case '+KEYGEN':
2897			case '+WBR':
2898				$this->reconstruct_active_formatting_elements();
2899				$this->insert_html_element( $this->state->current_token );
2900				$this->state->frameset_ok = false;
2901				return true;
2902
2903			/*
2904			 * > A start tag whose tag name is "input"
2905			 */
2906			case '+INPUT':
2907				$this->reconstruct_active_formatting_elements();
2908				$this->insert_html_element( $this->state->current_token );
2909
2910				/*
2911				 * > If the token does not have an attribute with the name "type", or if it does,
2912				 * > but that attribute's value is not an ASCII case-insensitive match for the
2913				 * > string "hidden", then: set the frameset-ok flag to "not ok".
2914				 */
2915				$type_attribute = $this->get_attribute( 'type' );
2916				if ( ! is_string( $type_attribute ) || 'hidden' !== strtolower( $type_attribute ) ) {
2917					$this->state->frameset_ok = false;
2918				}
2919
2920				return true;
2921
2922			/*
2923			 * > A start tag whose tag name is one of: "param", "source", "track"
2924			 */
2925			case '+PARAM':
2926			case '+SOURCE':
2927			case '+TRACK':
2928				$this->insert_html_element( $this->state->current_token );
2929				return true;
2930
2931			/*
2932			 * > A start tag whose tag name is "hr"
2933			 */
2934			case '+HR':
2935				if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2936					$this->close_a_p_element();
2937				}
2938				$this->insert_html_element( $this->state->current_token );
2939				$this->state->frameset_ok = false;
2940				return true;
2941
2942			/*
2943			 * > A start tag whose tag name is "image"
2944			 */
2945			case '+IMAGE':
2946				/*
2947				 * > Parse error. Change the token's tag name to "img" and reprocess it. (Don't ask.)
2948				 *
2949				 * Note that this is handled elsewhere, so it should not be possible to reach this code.
2950				 */
2951				$this->bail( "Cannot process an IMAGE tag. (Don't ask.)" );
2952				break;
2953
2954			/*
2955			 * > A start tag whose tag name is "textarea"
2956			 */
2957			case '+TEXTAREA':
2958				$this->insert_html_element( $this->state->current_token );
2959
2960				/*
2961				 * > If the next token is a U+000A LINE FEED (LF) character token, then ignore
2962				 * > that token and move on to the next one. (Newlines at the start of
2963				 * > textarea elements are ignored as an authoring convenience.)
2964				 *
2965				 * This is handled in `get_modifiable_text()`.
2966				 */
2967
2968				$this->state->frameset_ok = false;
2969
2970				/*
2971				 * > Switch the insertion mode to "text".
2972				 *
2973				 * As a self-contained node, this behavior is handled in the Tag Processor.
2974				 */
2975				return true;
2976
2977			/*
2978			 * > A start tag whose tag name is "xmp"
2979			 */
2980			case '+XMP':
2981				if ( $this->state->stack_of_open_elements->has_p_in_button_scope() ) {
2982					$this->close_a_p_element();
2983				}
2984
2985				$this->reconstruct_active_formatting_elements();
2986				$this->state->frameset_ok = false;
2987
2988				/*
2989				 * > Follow the generic raw text element parsing algorithm.
2990				 *
2991				 * As a self-contained node, this behavior is handled in the Tag Processor.
2992				 */
2993				$this->insert_html_element( $this->state->current_token );
2994				return true;
2995
2996			/*
			 * > A start tag whose tag name is "iframe"
2998			 */
2999			case '+IFRAME':
3000				$this->state->frameset_ok = false;
3001
3002				/*
3003				 * > Follow the generic raw text element parsing algorithm.
3004				 *
3005				 * As a self-contained node, this behavior is handled in the Tag Processor.
3006				 */
3007				$this->insert_html_element( $this->state->current_token );
3008				return true;
3009
3010			/*
3011			 * > A start tag whose tag name is "noembed"
3012			 * > A start tag whose tag name is "noscript", if the scripting flag is enabled
3013			 *
3014			 * The scripting flag is never enabled in this parser.
3015			 */
3016			case '+NOEMBED':
3017				$this->insert_html_element( $this->state->current_token );
3018				return true;
3019
3020			/*
3021			 * > A start tag whose tag name is "select"
3022			 */
3023			case '+SELECT':
3024				$this->reconstruct_active_formatting_elements();
3025				$this->insert_html_element( $this->state->current_token );
3026				$this->state->frameset_ok = false;
3027
3028				switch ( $this->state->insertion_mode ) {
3029					/*
3030					 * > If the insertion mode is one of "in table", "in caption", "in table body", "in row",
3031					 * > or "in cell", then switch the insertion mode to "in select in table".
3032					 */
3033					case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE:
3034					case WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION:
3035					case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY:
3036					case WP_HTML_Processor_State::INSERTION_MODE_IN_ROW:
3037					case WP_HTML_Processor_State::INSERTION_MODE_IN_CELL:
3038						$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE;
3039						break;
3040
3041					/*
3042					 * > Otherwise, switch the insertion mode to "in select".
3043					 */
3044					default:
3045						$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT;
3046						break;
3047				}
3048				return true;
3049
3050			/*
3051			 * > A start tag whose tag name is one of: "optgroup", "option"
3052			 */
3053			case '+OPTGROUP':
3054			case '+OPTION':
3055				if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
3056					$this->state->stack_of_open_elements->pop();
3057				}
3058				$this->reconstruct_active_formatting_elements();
3059				$this->insert_html_element( $this->state->current_token );
3060				return true;
3061
3062			/*
3063			 * > A start tag whose tag name is one of: "rb", "rtc"
3064			 */
3065			case '+RB':
3066			case '+RTC':
3067				if ( $this->state->stack_of_open_elements->has_element_in_scope( 'RUBY' ) ) {
3068					$this->generate_implied_end_tags();
3069
					if ( ! $this->state->stack_of_open_elements->current_node_is( 'RUBY' ) ) {
3071						// @todo Indicate a parse error once it's possible.
3072					}
3073				}
3074
3075				$this->insert_html_element( $this->state->current_token );
3076				return true;
3077
3078			/*
3079			 * > A start tag whose tag name is one of: "rp", "rt"
3080			 */
3081			case '+RP':
3082			case '+RT':
3083				if ( $this->state->stack_of_open_elements->has_element_in_scope( 'RUBY' ) ) {
3084					$this->generate_implied_end_tags( 'RTC' );
3085
3086					$current_node_name = $this->state->stack_of_open_elements->current_node()->node_name;
					if ( 'RTC' !== $current_node_name && 'RUBY' !== $current_node_name ) {
3088						// @todo Indicate a parse error once it's possible.
3089					}
3090				}
3091
3092				$this->insert_html_element( $this->state->current_token );
3093				return true;
3094
3095			/*
3096			 * > A start tag whose tag name is "math"
3097			 */
3098			case '+MATH':
3099				$this->reconstruct_active_formatting_elements();
3100
3101				/*
3102				 * @todo Adjust MathML attributes for the token. (This fixes the case of MathML attributes that are not all lowercase.)
3103				 * @todo Adjust foreign attributes for the token. (This fixes the use of namespaced attributes, in particular XLink.)
3104				 *
3105				 * These ought to be handled in the attribute methods.
3106				 */
3107				$this->state->current_token->namespace = 'math';
3108				$this->insert_html_element( $this->state->current_token );
3109				if ( $this->state->current_token->has_self_closing_flag ) {
3110					$this->state->stack_of_open_elements->pop();
3111				}
3112				return true;
3113
3114			/*
3115			 * > A start tag whose tag name is "svg"
3116			 */
3117			case '+SVG':
3118				$this->reconstruct_active_formatting_elements();
3119
3120				/*
3121				 * @todo Adjust SVG attributes for the token. (This fixes the case of SVG attributes that are not all lowercase.)
3122				 * @todo Adjust foreign attributes for the token. (This fixes the use of namespaced attributes, in particular XLink in SVG.)
3123				 *
3124				 * These ought to be handled in the attribute methods.
3125				 */
3126				$this->state->current_token->namespace = 'svg';
3127				$this->insert_html_element( $this->state->current_token );
3128				if ( $this->state->current_token->has_self_closing_flag ) {
3129					$this->state->stack_of_open_elements->pop();
3130				}
3131				return true;
3132
3133			/*
3134			 * > A start tag whose tag name is one of: "caption", "col", "colgroup",
3135			 * > "frame", "head", "tbody", "td", "tfoot", "th", "thead", "tr"
3136			 */
3137			case '+CAPTION':
3138			case '+COL':
3139			case '+COLGROUP':
3140			case '+FRAME':
3141			case '+HEAD':
3142			case '+TBODY':
3143			case '+TD':
3144			case '+TFOOT':
3145			case '+TH':
3146			case '+THEAD':
3147			case '+TR':
3148				// Parse error. Ignore the token.
3149				return $this->step();
3150		}
3151
3152		if ( ! parent::is_tag_closer() ) {
3153			/*
3154			 * > Any other start tag
3155			 */
3156			$this->reconstruct_active_formatting_elements();
3157			$this->insert_html_element( $this->state->current_token );
3158			return true;
3159		} else {
3160			/*
3161			 * > Any other end tag
3162			 */
3163
3164			/*
3165			 * Find the corresponding tag opener in the stack of open elements, if
3166			 * it exists before reaching a special element, which provides a kind
3167			 * of boundary in the stack. For example, a `</custom-tag>` should not
3168			 * close anything beyond its containing `P` or `DIV` element.
3169			 */
3170			foreach ( $this->state->stack_of_open_elements->walk_up() as $node ) {
3171				if ( 'html' === $node->namespace && $token_name === $node->node_name ) {
3172					break;
3173				}
3174
3175				if ( self::is_special( $node ) ) {
3176					// This is a parse error, ignore the token.
3177					return $this->step();
3178				}
3179			}
3180
3181			$this->generate_implied_end_tags( $token_name );
3182			if ( $node !== $this->state->stack_of_open_elements->current_node() ) {
3183				// @todo Record parse error: this error doesn't impact parsing.
3184			}
3185
3186			foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
3187				$this->state->stack_of_open_elements->pop();
3188				if ( $node === $item ) {
3189					return true;
3190				}
3191			}
3192		}
3193
3194		$this->bail( 'Should not have been able to reach end of IN BODY processing. Check HTML API code.' );
3195		// This unnecessary return prevents tools from inaccurately reporting type errors.
3196		return false;
3197	}
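
	/*
	 * Illustrative usage sketch (not part of core) for the heading end-tag rule in
	 * step_in_body(): an end tag for any of H1 through H6 closes the nearest open
	 * heading in scope, even when the tag names differ. It assumes the class's
	 * public methods `create_fragment()`, `next_tag()`, and `get_breadcrumbs()`,
	 * which are defined outside this section.
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<h1>Title</h2><p>After' );
	 *     if ( null !== $processor && $processor->next_tag( 'P' ) ) {
	 *         // The `</h2>` should already have popped the H1, so the P is
	 *         // expected to be a child of BODY rather than of H1.
	 *         $breadcrumbs = $processor->get_breadcrumbs();
	 *     }
	 */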
3198
3199	/**
3200	 * Parses next element in the 'in table' insertion mode.
3201	 *
3202	 * This internal function performs the 'in table' insertion mode
3203	 * logic for the generalized WP_HTML_Processor::step() function.
3204	 *
3205	 * @since 6.7.0
3206	 *
3207	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3208	 *
3209	 * @see https://html.spec.whatwg.org/#parsing-main-intable
3210	 * @see WP_HTML_Processor::step
3211	 *
3212	 * @return bool Whether an element was found.
3213	 */
3214	private function step_in_table(): bool {
3215		$token_name = $this->get_token_name();
3216		$token_type = $this->get_token_type();
3217		$op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
3218		$op         = "{$op_sigil}{$token_name}";
3219
3220		switch ( $op ) {
3221			/*
3222			 * > A character token, if the current node is table,
3223			 * > tbody, template, tfoot, thead, or tr element
3224			 */
3225			case '#text':
3226				$current_node      = $this->state->stack_of_open_elements->current_node();
3227				$current_node_name = $current_node ? $current_node->node_name : null;
3228				if (
3229					$current_node_name && (
3230						'TABLE' === $current_node_name ||
3231						'TBODY' === $current_node_name ||
3232						'TEMPLATE' === $current_node_name ||
3233						'TFOOT' === $current_node_name ||
3234						'THEAD' === $current_node_name ||
3235						'TR' === $current_node_name
3236					)
3237				) {
3238					/*
3239					 * If the text is empty after processing HTML entities and stripping
3240					 * U+0000 NULL bytes then ignore the token.
3241					 */
3242					if ( parent::TEXT_IS_NULL_SEQUENCE === $this->text_node_classification ) {
3243						return $this->step();
3244					}
3245
3246					/*
3247					 * This follows the rules for "in table text" insertion mode.
3248					 *
3249					 * Whitespace-only text nodes are inserted in-place. Otherwise
3250					 * foster parenting is enabled and the nodes would be
3251					 * inserted out-of-place.
3252					 *
3253					 * > If any of the tokens in the pending table character tokens
3254					 * > list are character tokens that are not ASCII whitespace,
3255					 * > then this is a parse error: reprocess the character tokens
3256					 * > in the pending table character tokens list using the rules
3257					 * > given in the "anything else" entry in the "in table"
3258					 * > insertion mode.
3259					 * >
3260					 * > Otherwise, insert the characters given by the pending table
3261					 * > character tokens list.
3262					 *
3263					 * @see https://html.spec.whatwg.org/#parsing-main-intabletext
3264					 */
3265					if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
3266						$this->insert_html_element( $this->state->current_token );
3267						return true;
3268					}
3269
3270					// Non-whitespace would trigger fostering, unsupported at this time.
3271					$this->bail( 'Foster parenting is not supported.' );
3272					break;
3273				}
3274				break;
3275
3276			/*
3277			 * > A comment token
3278			 */
3279			case '#comment':
3280			case '#funky-comment':
3281			case '#presumptuous-tag':
3282				$this->insert_html_element( $this->state->current_token );
3283				return true;
3284
3285			/*
3286			 * > A DOCTYPE token
3287			 */
3288			case 'html':
3289				// Parse error: ignore the token.
3290				return $this->step();
3291
3292			/*
3293			 * > A start tag whose tag name is "caption"
3294			 */
3295			case '+CAPTION':
3296				$this->state->stack_of_open_elements->clear_to_table_context();
3297				$this->state->active_formatting_elements->insert_marker();
3298				$this->insert_html_element( $this->state->current_token );
3299				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION;
3300				return true;
3301
3302			/*
3303			 * > A start tag whose tag name is "colgroup"
3304			 */
3305			case '+COLGROUP':
3306				$this->state->stack_of_open_elements->clear_to_table_context();
3307				$this->insert_html_element( $this->state->current_token );
3308				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
3309				return true;
3310
3311			/*
3312			 * > A start tag whose tag name is "col"
3313			 */
3314			case '+COL':
3315				$this->state->stack_of_open_elements->clear_to_table_context();
3316
3317				/*
3318				 * > Insert an HTML element for a "colgroup" start tag token with no attributes,
3319				 * > then switch the insertion mode to "in column group".
3320				 */
3321				$this->insert_virtual_node( 'COLGROUP' );
3322				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
3323				return $this->step( self::REPROCESS_CURRENT_NODE );
3324
3325			/*
3326			 * > A start tag whose tag name is one of: "tbody", "tfoot", "thead"
3327			 */
3328			case '+TBODY':
3329			case '+TFOOT':
3330			case '+THEAD':
3331				$this->state->stack_of_open_elements->clear_to_table_context();
3332				$this->insert_html_element( $this->state->current_token );
3333				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3334				return true;
3335
3336			/*
3337			 * > A start tag whose tag name is one of: "td", "th", "tr"
3338			 */
3339			case '+TD':
3340			case '+TH':
3341			case '+TR':
3342				$this->state->stack_of_open_elements->clear_to_table_context();
3343				/*
3344				 * > Insert an HTML element for a "tbody" start tag token with no attributes,
3345				 * > then switch the insertion mode to "in table body".
3346				 */
3347				$this->insert_virtual_node( 'TBODY' );
3348				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3349				return $this->step( self::REPROCESS_CURRENT_NODE );
3350
3351			/*
3352			 * > A start tag whose tag name is "table"
3353			 *
3354			 * This tag in the IN TABLE insertion mode is a parse error.
3355			 */
3356			case '+TABLE':
3357				if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TABLE' ) ) {
3358					return $this->step();
3359				}
3360
3361				$this->state->stack_of_open_elements->pop_until( 'TABLE' );
3362				$this->reset_insertion_mode_appropriately();
3363				return $this->step( self::REPROCESS_CURRENT_NODE );
3364
3365			/*
3366			 * > An end tag whose tag name is "table"
3367			 */
3368			case '-TABLE':
3369				if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TABLE' ) ) {
3370					// @todo Indicate a parse error once it's possible.
3371					return $this->step();
3372				}
3373
3374				$this->state->stack_of_open_elements->pop_until( 'TABLE' );
3375				$this->reset_insertion_mode_appropriately();
3376				return true;
3377
3378			/*
3379			 * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html", "tbody", "td", "tfoot", "th", "thead", "tr"
3380			 */
3381			case '-BODY':
3382			case '-CAPTION':
3383			case '-COL':
3384			case '-COLGROUP':
3385			case '-HTML':
3386			case '-TBODY':
3387			case '-TD':
3388			case '-TFOOT':
3389			case '-TH':
3390			case '-THEAD':
3391			case '-TR':
3392				// Parse error: ignore the token.
3393				return $this->step();
3394
3395			/*
3396			 * > A start tag whose tag name is one of: "style", "script", "template"
3397			 * > An end tag whose tag name is "template"
3398			 */
3399			case '+STYLE':
3400			case '+SCRIPT':
3401			case '+TEMPLATE':
3402			case '-TEMPLATE':
3403				/*
3404				 * > Process the token using the rules for the "in head" insertion mode.
3405				 */
3406				return $this->step_in_head();
3407
3408			/*
3409			 * > A start tag whose tag name is "input"
3410			 *
3411			 * > If the token does not have an attribute with the name "type", or if it does, but
3412			 * > that attribute's value is not an ASCII case-insensitive match for the string
3413			 * > "hidden", then: act as described in the "anything else" entry below.
3414			 */
3415			case '+INPUT':
3416				$type_attribute = $this->get_attribute( 'type' );
3417				if ( ! is_string( $type_attribute ) || 'hidden' !== strtolower( $type_attribute ) ) {
3418					goto anything_else;
3419				}
3420				// @todo Indicate a parse error once it's possible.
3421				$this->insert_html_element( $this->state->current_token );
3422				return true;
3423
3424			/*
3425			 * > A start tag whose tag name is "form"
3426			 *
3427			 * This tag in the IN TABLE insertion mode is a parse error.
3428			 */
3429			case '+FORM':
3430				if (
					$this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ||
3432					isset( $this->state->form_element )
3433				) {
3434					return $this->step();
3435				}
3436
3437				// This FORM is special because it immediately closes and cannot have other children.
3438				$this->insert_html_element( $this->state->current_token );
3439				$this->state->form_element = $this->state->current_token;
3440				$this->state->stack_of_open_elements->pop();
3441				return true;
3442		}
3443
3444		/*
3445		 * > Anything else
3446		 * > Parse error. Enable foster parenting, process the token using the rules for the
3447		 * > "in body" insertion mode, and then disable foster parenting.
3448		 *
3449		 * @todo Indicate a parse error once it's possible.
3450		 */
3451		anything_else:
3452		$this->bail( 'Foster parenting is not supported.' );
3453	}
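
	/*
	 * Illustrative usage sketch (not part of core) for step_in_table(). Two outcomes
	 * follow from the rules above: a TD directly inside a TABLE gets an implied TBODY
	 * (and, in the later "in table body" mode, an implied TR), while non-whitespace
	 * text directly inside a TABLE would require foster parenting, so the processor
	 * bails. The public methods used here are assumed from the rest of the class.
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<table><td>cell' );
	 *     if ( null !== $processor && $processor->next_tag( 'TD' ) ) {
	 *         // Expected to include TBODY although the markup never wrote one.
	 *         $breadcrumbs = $processor->get_breadcrumbs();
	 *     }
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<table>stray text' );
	 *     if ( null !== $processor ) {
	 *         while ( $processor->next_token() ) {
	 *             // Visits TABLE, then stops at the unsupported text node.
	 *         }
	 *         // Expected to equal WP_HTML_Processor::ERROR_UNSUPPORTED.
	 *         $error = $processor->get_last_error();
	 *     }
	 */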
3454
3455	/**
3456	 * Parses next element in the 'in table text' insertion mode.
3457	 *
3458	 * This internal function performs the 'in table text' insertion mode
3459	 * logic for the generalized WP_HTML_Processor::step() function.
3460	 *
3461	 * @since 6.7.0 Stub implementation.
3462	 *
3463	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3464	 *
3465	 * @see https://html.spec.whatwg.org/#parsing-main-intabletext
3466	 * @see WP_HTML_Processor::step
3467	 *
3468	 * @return bool Whether an element was found.
3469	 */
3470	private function step_in_table_text(): bool {
3471		$this->bail( 'No support for parsing in the ' . WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_TEXT . ' state.' );
3472	}
3473
3474	/**
3475	 * Parses next element in the 'in caption' insertion mode.
3476	 *
3477	 * This internal function performs the 'in caption' insertion mode
3478	 * logic for the generalized WP_HTML_Processor::step() function.
3479	 *
3480	 * @since 6.7.0
3481	 *
3482	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3483	 *
3484	 * @see https://html.spec.whatwg.org/#parsing-main-incaption
3485	 * @see WP_HTML_Processor::step
3486	 *
3487	 * @return bool Whether an element was found.
3488	 */
3489	private function step_in_caption(): bool {
3490		$tag_name = $this->get_tag();
3491		$op_sigil = $this->is_tag_closer() ? '-' : '+';
3492		$op       = "{$op_sigil}{$tag_name}";
3493
3494		switch ( $op ) {
3495			/*
3496			 * > An end tag whose tag name is "caption"
3497			 * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "td", "tfoot", "th", "thead", "tr"
3498			 * > An end tag whose tag name is "table"
3499			 *
3500			 * These tag handling rules are identical except for the final instruction.
3501			 * Handle them in a single block.
3502			 */
3503			case '-CAPTION':
3504			case '+CAPTION':
3505			case '+COL':
3506			case '+COLGROUP':
3507			case '+TBODY':
3508			case '+TD':
3509			case '+TFOOT':
3510			case '+TH':
3511			case '+THEAD':
3512			case '+TR':
3513			case '-TABLE':
3514				if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'CAPTION' ) ) {
3515					// Parse error: ignore the token.
3516					return $this->step();
3517				}
3518
3519				$this->generate_implied_end_tags();
3520				if ( ! $this->state->stack_of_open_elements->current_node_is( 'CAPTION' ) ) {
3521					// @todo Indicate a parse error once it's possible.
3522				}
3523
3524				$this->state->stack_of_open_elements->pop_until( 'CAPTION' );
3525				$this->state->active_formatting_elements->clear_up_to_last_marker();
3526				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3527
3528				// If this is not a CAPTION end tag, the token should be reprocessed.
3529				if ( '-CAPTION' === $op ) {
3530					return true;
3531				}
3532				return $this->step( self::REPROCESS_CURRENT_NODE );
3533
			/*
3535			 * > An end tag whose tag name is one of: "body", "col", "colgroup", "html", "tbody", "td", "tfoot", "th", "thead", "tr"
3536			 */
3537			case '-BODY':
3538			case '-COL':
3539			case '-COLGROUP':
3540			case '-HTML':
3541			case '-TBODY':
3542			case '-TD':
3543			case '-TFOOT':
3544			case '-TH':
3545			case '-THEAD':
3546			case '-TR':
3547				// Parse error: ignore the token.
3548				return $this->step();
3549		}
3550
3551		/*
3552		 * > Anything else
3553		 * >   Process the token using the rules for the "in body" insertion mode.
3554		 */
3555		return $this->step_in_body();
3556	}
3557
3558	/**
3559	 * Parses next element in the 'in column group' insertion mode.
3560	 *
3561	 * This internal function performs the 'in column group' insertion mode
3562	 * logic for the generalized WP_HTML_Processor::step() function.
3563	 *
3564	 * @since 6.7.0
3565	 *
3566	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3567	 *
3568	 * @see https://html.spec.whatwg.org/#parsing-main-incolgroup
3569	 * @see WP_HTML_Processor::step
3570	 *
3571	 * @return bool Whether an element was found.
3572	 */
3573	private function step_in_column_group(): bool {
3574		$token_name = $this->get_token_name();
3575		$token_type = $this->get_token_type();
3576		$op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
3577		$op         = "{$op_sigil}{$token_name}";
3578
3579		switch ( $op ) {
3580			/*
3581			 * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
3582			 * > U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
3583			 */
3584			case '#text':
3585				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
3586					// Insert the character.
3587					$this->insert_html_element( $this->state->current_token );
3588					return true;
3589				}
3590
3591				goto in_column_group_anything_else;
3592				break;
3593
3594			/*
3595			 * > A comment token
3596			 */
3597			case '#comment':
3598			case '#funky-comment':
3599			case '#presumptuous-tag':
3600				$this->insert_html_element( $this->state->current_token );
3601				return true;
3602
3603			/*
3604			 * > A DOCTYPE token
3605			 */
3606			case 'html':
3607				// @todo Indicate a parse error once it's possible.
3608				return $this->step();
3609
3610			/*
3611			 * > A start tag whose tag name is "html"
3612			 */
3613			case '+HTML':
3614				return $this->step_in_body();
3615
3616			/*
3617			 * > A start tag whose tag name is "col"
3618			 */
3619			case '+COL':
3620				$this->insert_html_element( $this->state->current_token );
3621				$this->state->stack_of_open_elements->pop();
3622				return true;
3623
3624			/*
3625			 * > An end tag whose tag name is "colgroup"
3626			 */
3627			case '-COLGROUP':
3628				if ( ! $this->state->stack_of_open_elements->current_node_is( 'COLGROUP' ) ) {
3629					// @todo Indicate a parse error once it's possible.
3630					return $this->step();
3631				}
3632				$this->state->stack_of_open_elements->pop();
3633				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3634				return true;
3635
3636			/*
3637			 * > An end tag whose tag name is "col"
3638			 */
3639			case '-COL':
3640				// Parse error: ignore the token.
3641				return $this->step();
3642
3643			/*
3644			 * > A start tag whose tag name is "template"
3645			 * > An end tag whose tag name is "template"
3646			 */
3647			case '+TEMPLATE':
3648			case '-TEMPLATE':
3649				return $this->step_in_head();
3650		}
3651
3652		in_column_group_anything_else:
3653		/*
3654		 * > Anything else
3655		 */
3656		if ( ! $this->state->stack_of_open_elements->current_node_is( 'COLGROUP' ) ) {
3657			// @todo Indicate a parse error once it's possible.
3658			return $this->step();
3659		}
3660		$this->state->stack_of_open_elements->pop();
3661		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3662		return $this->step( self::REPROCESS_CURRENT_NODE );
3663	}
3664
3665	/**
3666	 * Parses next element in the 'in table body' insertion mode.
3667	 *
3668	 * This internal function performs the 'in table body' insertion mode
3669	 * logic for the generalized WP_HTML_Processor::step() function.
3670	 *
3671	 * @since 6.7.0
3672	 *
3673	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3674	 *
3675	 * @see https://html.spec.whatwg.org/#parsing-main-intbody
3676	 * @see WP_HTML_Processor::step
3677	 *
3678	 * @return bool Whether an element was found.
3679	 */
3680	private function step_in_table_body(): bool {
3681		$tag_name = $this->get_tag();
3682		$op_sigil = $this->is_tag_closer() ? '-' : '+';
3683		$op       = "{$op_sigil}{$tag_name}";
3684
3685		switch ( $op ) {
3686			/*
3687			 * > A start tag whose tag name is "tr"
3688			 */
3689			case '+TR':
3690				$this->state->stack_of_open_elements->clear_to_table_body_context();
3691				$this->insert_html_element( $this->state->current_token );
3692				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
3693				return true;
3694
3695			/*
3696			 * > A start tag whose tag name is one of: "th", "td"
3697			 */
3698			case '+TH':
3699			case '+TD':
3700				// @todo Indicate a parse error once it's possible.
3701				$this->state->stack_of_open_elements->clear_to_table_body_context();
3702				$this->insert_virtual_node( 'TR' );
3703				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
3704				return $this->step( self::REPROCESS_CURRENT_NODE );
3705
3706			/*
3707			 * > An end tag whose tag name is one of: "tbody", "tfoot", "thead"
3708			 */
3709			case '-TBODY':
3710			case '-TFOOT':
3711			case '-THEAD':
3712				if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3713					// Parse error: ignore the token.
3714					return $this->step();
3715				}
3716
3717				$this->state->stack_of_open_elements->clear_to_table_body_context();
3718				$this->state->stack_of_open_elements->pop();
3719				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3720				return true;
3721
3722			/*
3723			 * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "tfoot", "thead"
3724			 * > An end tag whose tag name is "table"
3725			 */
3726			case '+CAPTION':
3727			case '+COL':
3728			case '+COLGROUP':
3729			case '+TBODY':
3730			case '+TFOOT':
3731			case '+THEAD':
3732			case '-TABLE':
3733				if (
3734					! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TBODY' ) &&
3735					! $this->state->stack_of_open_elements->has_element_in_table_scope( 'THEAD' ) &&
3736					! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TFOOT' )
3737				) {
3738					// Parse error: ignore the token.
3739					return $this->step();
3740				}
3741				$this->state->stack_of_open_elements->clear_to_table_body_context();
3742				$this->state->stack_of_open_elements->pop();
3743				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
3744				return $this->step( self::REPROCESS_CURRENT_NODE );
3745
3746			/*
3747			 * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html", "td", "th", "tr"
3748			 */
3749			case '-BODY':
3750			case '-CAPTION':
3751			case '-COL':
3752			case '-COLGROUP':
3753			case '-HTML':
3754			case '-TD':
3755			case '-TH':
3756			case '-TR':
3757				// Parse error: ignore the token.
3758				return $this->step();
3759		}
3760
3761		/*
3762		 * > Anything else
3763		 * >   Process the token using the rules for the "in table" insertion mode.
3764		 */
3765		return $this->step_in_table();
3766	}
3767
3768	/**
3769	 * Parses next element in the 'in row' insertion mode.
3770	 *
3771	 * This internal function performs the 'in row' insertion mode
3772	 * logic for the generalized WP_HTML_Processor::step() function.
3773	 *
3774	 * @since 6.7.0
3775	 *
3776	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3777	 *
3778	 * @see https://html.spec.whatwg.org/#parsing-main-intr
3779	 * @see WP_HTML_Processor::step
3780	 *
3781	 * @return bool Whether an element was found.
3782	 */
3783	private function step_in_row(): bool {
3784		$tag_name = $this->get_tag();
3785		$op_sigil = $this->is_tag_closer() ? '-' : '+';
3786		$op       = "{$op_sigil}{$tag_name}";
3787
3788		switch ( $op ) {
3789			/*
3790			 * > A start tag whose tag name is one of: "th", "td"
3791			 */
3792			case '+TH':
3793			case '+TD':
3794				$this->state->stack_of_open_elements->clear_to_table_row_context();
3795				$this->insert_html_element( $this->state->current_token );
3796				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CELL;
3797				$this->state->active_formatting_elements->insert_marker();
3798				return true;
3799
3800			/*
3801			 * > An end tag whose tag name is "tr"
3802			 */
3803			case '-TR':
3804				if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TR' ) ) {
3805					// Parse error: ignore the token.
3806					return $this->step();
3807				}
3808
3809				$this->state->stack_of_open_elements->clear_to_table_row_context();
3810				$this->state->stack_of_open_elements->pop();
3811				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3812				return true;
3813
3814			/*
3815			 * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "tfoot", "thead", "tr"
3816			 * > An end tag whose tag name is "table"
3817			 */
3818			case '+CAPTION':
3819			case '+COL':
3820			case '+COLGROUP':
3821			case '+TBODY':
3822			case '+TFOOT':
3823			case '+THEAD':
3824			case '+TR':
3825			case '-TABLE':
3826				if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TR' ) ) {
3827					// Parse error: ignore the token.
3828					return $this->step();
3829				}
3830
3831				$this->state->stack_of_open_elements->clear_to_table_row_context();
3832				$this->state->stack_of_open_elements->pop();
3833				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3834				return $this->step( self::REPROCESS_CURRENT_NODE );
3835
3836			/*
3837			 * > An end tag whose tag name is one of: "tbody", "tfoot", "thead"
3838			 */
3839			case '-TBODY':
3840			case '-TFOOT':
3841			case '-THEAD':
3842				if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3843					// Parse error: ignore the token.
3844					return $this->step();
3845				}
3846
3847				if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( 'TR' ) ) {
3848					// Ignore the token.
3849					return $this->step();
3850				}
3851
3852				$this->state->stack_of_open_elements->clear_to_table_row_context();
3853				$this->state->stack_of_open_elements->pop();
3854				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
3855				return $this->step( self::REPROCESS_CURRENT_NODE );
3856
3857			/*
3858			 * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html", "td", "th"
3859			 */
3860			case '-BODY':
3861			case '-CAPTION':
3862			case '-COL':
3863			case '-COLGROUP':
3864			case '-HTML':
3865			case '-TD':
3866			case '-TH':
3867				// Parse error: ignore the token.
3868				return $this->step();
3869		}
3870
3871		/*
3872		 * > Anything else
3873		 * >   Process the token using the rules for the "in table" insertion mode.
3874		 */
3875		return $this->step_in_table();
3876	}
3877
3878	/**
3879	 * Parses next element in the 'in cell' insertion mode.
3880	 *
3881	 * This internal function performs the 'in cell' insertion mode
3882	 * logic for the generalized WP_HTML_Processor::step() function.
3883	 *
3884	 * @since 6.7.0
3885	 *
3886	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3887	 *
3888	 * @see https://html.spec.whatwg.org/#parsing-main-intd
3889	 * @see WP_HTML_Processor::step
3890	 *
3891	 * @return bool Whether an element was found.
3892	 */
3893	private function step_in_cell(): bool {
3894		$tag_name = $this->get_tag();
3895		$op_sigil = $this->is_tag_closer() ? '-' : '+';
3896		$op       = "{$op_sigil}{$tag_name}";
3897
3898		switch ( $op ) {
3899			/*
3900			 * > An end tag whose tag name is one of: "td", "th"
3901			 */
3902			case '-TD':
3903			case '-TH':
3904				if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3905					// Parse error: ignore the token.
3906					return $this->step();
3907				}
3908
3909				$this->generate_implied_end_tags();
3910
3911				/*
3912				 * @todo This needs to check if the current node is an HTML element, meaning that
3913				 *       when SVG and MathML support is added, this needs to differentiate between an
3914				 *       HTML element of the given name, such as `<center>`, and a foreign element of
3915				 *       the same given name.
3916				 */
3917				if ( ! $this->state->stack_of_open_elements->current_node_is( $tag_name ) ) {
3918					// @todo Indicate a parse error once it's possible.
3919				}
3920
3921				$this->state->stack_of_open_elements->pop_until( $tag_name );
3922				$this->state->active_formatting_elements->clear_up_to_last_marker();
3923				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
3924				return true;
3925
3926			/*
3927			 * > A start tag whose tag name is one of: "caption", "col", "colgroup", "tbody", "td",
3928			 * > "tfoot", "th", "thead", "tr"
3929			 */
3930			case '+CAPTION':
3931			case '+COL':
3932			case '+COLGROUP':
3933			case '+TBODY':
3934			case '+TD':
3935			case '+TFOOT':
3936			case '+TH':
3937			case '+THEAD':
3938			case '+TR':
3939				/*
3940				 * > Assert: The stack of open elements has a td or th element in table scope.
3941				 *
3942				 * Nothing to do here, except to verify in tests that this never appears.
3943				 */
3944
3945				$this->close_cell();
3946				return $this->step( self::REPROCESS_CURRENT_NODE );
3947
3948			/*
3949			 * > An end tag whose tag name is one of: "body", "caption", "col", "colgroup", "html"
3950			 */
3951			case '-BODY':
3952			case '-CAPTION':
3953			case '-COL':
3954			case '-COLGROUP':
3955			case '-HTML':
3956				// Parse error: ignore the token.
3957				return $this->step();
3958
3959			/*
3960			 * > An end tag whose tag name is one of: "table", "tbody", "tfoot", "thead", "tr"
3961			 */
3962			case '-TABLE':
3963			case '-TBODY':
3964			case '-TFOOT':
3965			case '-THEAD':
3966			case '-TR':
3967				if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $tag_name ) ) {
3968					// Parse error: ignore the token.
3969					return $this->step();
3970				}
3971				$this->close_cell();
3972				return $this->step( self::REPROCESS_CURRENT_NODE );
3973		}
3974
3975		/*
3976		 * > Anything else
3977		 * >   Process the token using the rules for the "in body" insertion mode.
3978		 */
3979		return $this->step_in_body();
3980	}
3981
3982	/**
3983	 * Parses next element in the 'in select' insertion mode.
3984	 *
3985	 * This internal function performs the 'in select' insertion mode
3986	 * logic for the generalized WP_HTML_Processor::step() function.
3987	 *
3988	 * @since 6.7.0
3989	 *
3990	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
3991	 *
3992	 * @see https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inselect
3993	 * @see WP_HTML_Processor::step
3994	 *
3995	 * @return bool Whether an element was found.
3996	 */
3997	private function step_in_select(): bool {
3998		$token_name = $this->get_token_name();
3999		$token_type = $this->get_token_type();
4000		$op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
4001		$op         = "{$op_sigil}{$token_name}";
4002
4003		switch ( $op ) {
4004			/*
4005			 * > Any other character token
4006			 */
4007			case '#text':
4008				/*
4009				 * > A character token that is U+0000 NULL
4010				 *
4011				 * If a text node only comprises null bytes then it should be
4012				 * entirely ignored and should not return to calling code.
4013				 */
4014				if ( parent::TEXT_IS_NULL_SEQUENCE === $this->text_node_classification ) {
4015					// Parse error: ignore the token.
4016					return $this->step();
4017				}
4018
4019				$this->insert_html_element( $this->state->current_token );
4020				return true;
4021
4022			/*
4023			 * > A comment token
4024			 */
4025			case '#comment':
4026			case '#funky-comment':
4027			case '#presumptuous-tag':
4028				$this->insert_html_element( $this->state->current_token );
4029				return true;
4030
4031			/*
4032			 * > A DOCTYPE token
4033			 */
4034			case 'html':
4035				// Parse error: ignore the token.
4036				return $this->step();
4037
4038			/*
4039			 * > A start tag whose tag name is "html"
4040			 */
4041			case '+HTML':
4042				return $this->step_in_body();
4043
4044			/*
4045			 * > A start tag whose tag name is "option"
4046			 */
4047			case '+OPTION':
4048				if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
4049					$this->state->stack_of_open_elements->pop();
4050				}
4051				$this->insert_html_element( $this->state->current_token );
4052				return true;
4053
4054			/*
4055			 * > A start tag whose tag name is "optgroup"
4056			 * > A start tag whose tag name is "hr"
4057			 *
4058			 * These rules are identical except for the treatment of the self-closing flag and
4059			 * the subsequent pop of the HR void element, all of which is handled elsewhere in the processor.
4060			 */
4061			case '+OPTGROUP':
4062			case '+HR':
4063				if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
4064					$this->state->stack_of_open_elements->pop();
4065				}
4066
4067				if ( $this->state->stack_of_open_elements->current_node_is( 'OPTGROUP' ) ) {
4068					$this->state->stack_of_open_elements->pop();
4069				}
4070
4071				$this->insert_html_element( $this->state->current_token );
4072				return true;
4073
4074			/*
4075			 * > An end tag whose tag name is "optgroup"
4076			 */
4077			case '-OPTGROUP':
4078				$current_node = $this->state->stack_of_open_elements->current_node();
4079				if ( $current_node && 'OPTION' === $current_node->node_name ) {
4080					foreach ( $this->state->stack_of_open_elements->walk_up( $current_node ) as $parent ) {
4081						break;
4082					}
4083					if ( isset( $parent ) && 'OPTGROUP' === $parent->node_name ) {
4084						$this->state->stack_of_open_elements->pop();
4085					}
4086				}
4087
4088				if ( $this->state->stack_of_open_elements->current_node_is( 'OPTGROUP' ) ) {
4089					$this->state->stack_of_open_elements->pop();
4090					return true;
4091				}
4092
4093				// Parse error: ignore the token.
4094				return $this->step();
4095
4096			/*
4097			 * > An end tag whose tag name is "option"
4098			 */
4099			case '-OPTION':
4100				if ( $this->state->stack_of_open_elements->current_node_is( 'OPTION' ) ) {
4101					$this->state->stack_of_open_elements->pop();
4102					return true;
4103				}
4104
4105				// Parse error: ignore the token.
4106				return $this->step();
4107
4108			/*
4109			 * > An end tag whose tag name is "select"
4110			 * > A start tag whose tag name is "select"
4111			 *
4112			 * > It just gets treated like an end tag.
4113			 */
4114			case '-SELECT':
4115			case '+SELECT':
4116				if ( ! $this->state->stack_of_open_elements->has_element_in_select_scope( 'SELECT' ) ) {
4117					// Parse error: ignore the token.
4118					return $this->step();
4119				}
4120				$this->state->stack_of_open_elements->pop_until( 'SELECT' );
4121				$this->reset_insertion_mode_appropriately();
4122				return true;
4123
4124			/*
4125			 * > A start tag whose tag name is one of: "input", "keygen", "textarea"
4126			 *
4127			 * All three of these tags are considered a parse error when found in this insertion mode.
4128			 */
4129			case '+INPUT':
4130			case '+KEYGEN':
4131			case '+TEXTAREA':
4132				if ( ! $this->state->stack_of_open_elements->has_element_in_select_scope( 'SELECT' ) ) {
4133					// Ignore the token.
4134					return $this->step();
4135				}
4136				$this->state->stack_of_open_elements->pop_until( 'SELECT' );
4137				$this->reset_insertion_mode_appropriately();
4138				return $this->step( self::REPROCESS_CURRENT_NODE );
4139
4140			/*
4141			 * > A start tag whose tag name is one of: "script", "template"
4142			 * > An end tag whose tag name is "template"
4143			 */
4144			case '+SCRIPT':
4145			case '+TEMPLATE':
4146			case '-TEMPLATE':
4147				return $this->step_in_head();
4148		}
4149
4150		/*
4151		 * > Anything else
4152		 * >   Parse error: ignore the token.
4153		 */
4154		return $this->step();
4155	}
4156
4157	/**
4158	 * Parses next element in the 'in select in table' insertion mode.
4159	 *
4160	 * This internal function performs the 'in select in table' insertion mode
4161	 * logic for the generalized WP_HTML_Processor::step() function.
4162	 *
4163	 * @since 6.7.0
4164	 *
4165	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4166	 *
4167	 * @see https://html.spec.whatwg.org/#parsing-main-inselectintable
4168	 * @see WP_HTML_Processor::step
4169	 *
4170	 * @return bool Whether an element was found.
4171	 */
4172	private function step_in_select_in_table(): bool {
4173		$token_name = $this->get_token_name();
4174		$token_type = $this->get_token_type();
4175		$op_sigil   = '#tag' === $token_type ? ( parent::is_tag_closer() ? '-' : '+' ) : '';
4176		$op         = "{$op_sigil}{$token_name}";
4177
4178		switch ( $op ) {
4179			/*
4180			 * > A start tag whose tag name is one of: "caption", "table", "tbody", "tfoot", "thead", "tr", "td", "th"
4181			 */
4182			case '+CAPTION':
4183			case '+TABLE':
4184			case '+TBODY':
4185			case '+TFOOT':
4186			case '+THEAD':
4187			case '+TR':
4188			case '+TD':
4189			case '+TH':
4190				// @todo Indicate a parse error once it's possible.
4191				$this->state->stack_of_open_elements->pop_until( 'SELECT' );
4192				$this->reset_insertion_mode_appropriately();
4193				return $this->step( self::REPROCESS_CURRENT_NODE );
4194
4195			/*
4196			 * > An end tag whose tag name is one of: "caption", "table", "tbody", "tfoot", "thead", "tr", "td", "th"
4197			 */
4198			case '-CAPTION':
4199			case '-TABLE':
4200			case '-TBODY':
4201			case '-TFOOT':
4202			case '-THEAD':
4203			case '-TR':
4204			case '-TD':
4205			case '-TH':
4206				// @todo Indicate a parse error once it's possible.
4207				if ( ! $this->state->stack_of_open_elements->has_element_in_table_scope( $token_name ) ) {
4208					return $this->step();
4209				}
4210				$this->state->stack_of_open_elements->pop_until( 'SELECT' );
4211				$this->reset_insertion_mode_appropriately();
4212				return $this->step( self::REPROCESS_CURRENT_NODE );
4213		}
4214
4215		/*
4216		 * > Anything else
4217		 */
4218		return $this->step_in_select();
4219	}
4220
4221	/**
4222	 * Parses next element in the 'in template' insertion mode.
4223	 *
4224	 * This internal function performs the 'in template' insertion mode
4225	 * logic for the generalized WP_HTML_Processor::step() function.
4226	 *
4227	 * @since 6.7.0
4228	 *
4229	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4230	 *
4231	 * @see https://html.spec.whatwg.org/#parsing-main-intemplate
4232	 * @see WP_HTML_Processor::step
4233	 *
4234	 * @return bool Whether an element was found.
4235	 */
4236	private function step_in_template(): bool {
4237		$token_name = $this->get_token_name();
4238		$token_type = $this->get_token_type();
4239		$is_closer  = $this->is_tag_closer();
4240		$op_sigil   = '#tag' === $token_type ? ( $is_closer ? '-' : '+' ) : '';
4241		$op         = "{$op_sigil}{$token_name}";
4242
4243		switch ( $op ) {
4244			/*
4245			 * > A character token
4246			 * > A comment token
4247			 * > A DOCTYPE token
4248			 */
4249			case '#text':
4250			case '#comment':
4251			case '#funky-comment':
4252			case '#presumptuous-tag':
4253			case 'html':
4254				return $this->step_in_body();
4255
4256			/*
4257			 * > A start tag whose tag name is one of: "base", "basefont", "bgsound", "link",
4258			 * > "meta", "noframes", "script", "style", "template", "title"
4259			 * > An end tag whose tag name is "template"
4260			 */
4261			case '+BASE':
4262			case '+BASEFONT':
4263			case '+BGSOUND':
4264			case '+LINK':
4265			case '+META':
4266			case '+NOFRAMES':
4267			case '+SCRIPT':
4268			case '+STYLE':
4269			case '+TEMPLATE':
4270			case '+TITLE':
4271			case '-TEMPLATE':
4272				return $this->step_in_head();
4273
4274			/*
4275			 * > A start tag whose tag name is one of: "caption", "colgroup", "tbody", "tfoot", "thead"
4276			 */
4277			case '+CAPTION':
4278			case '+COLGROUP':
4279			case '+TBODY':
4280			case '+TFOOT':
4281			case '+THEAD':
4282				array_pop( $this->state->stack_of_template_insertion_modes );
4283				$this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
4284				$this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
4285				return $this->step( self::REPROCESS_CURRENT_NODE );
4286
4287			/*
4288			 * > A start tag whose tag name is "col"
4289			 */
4290			case '+COL':
4291				array_pop( $this->state->stack_of_template_insertion_modes );
4292				$this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
4293				$this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
4294				return $this->step( self::REPROCESS_CURRENT_NODE );
4295
4296			/*
4297			 * > A start tag whose tag name is "tr"
4298			 */
4299			case '+TR':
4300				array_pop( $this->state->stack_of_template_insertion_modes );
4301				$this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
4302				$this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
4303				return $this->step( self::REPROCESS_CURRENT_NODE );
4304
4305			/*
4306			 * > A start tag whose tag name is one of: "td", "th"
4307			 */
4308			case '+TD':
4309			case '+TH':
4310				array_pop( $this->state->stack_of_template_insertion_modes );
4311				$this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
4312				$this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
4313				return $this->step( self::REPROCESS_CURRENT_NODE );
4314		}
4315
4316		/*
4317		 * > Any other start tag
4318		 */
4319		if ( ! $is_closer ) {
4320			array_pop( $this->state->stack_of_template_insertion_modes );
4321			$this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4322			$this->state->insertion_mode                      = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4323			return $this->step( self::REPROCESS_CURRENT_NODE );
4324		}
4325
4326		/*
4327		 * > Any other end tag
4328		 */
4329		if ( $is_closer ) {
4330			// Parse error: ignore the token.
4331			return $this->step();
4332		}
4333
4334		/*
4335		 * > An end-of-file token
4336		 */
4337		if ( ! $this->state->stack_of_open_elements->contains( 'TEMPLATE' ) ) {
4338			// Stop parsing.
4339			return false;
4340		}
4341
4342		// @todo Indicate a parse error once it's possible.
4343		$this->state->stack_of_open_elements->pop_until( 'TEMPLATE' );
4344		$this->state->active_formatting_elements->clear_up_to_last_marker();
4345		array_pop( $this->state->stack_of_template_insertion_modes );
4346		$this->reset_insertion_mode_appropriately();
4347		return $this->step( self::REPROCESS_CURRENT_NODE );
4348	}
4349
4350	/**
4351	 * Parses next element in the 'after body' insertion mode.
4352	 *
4353	 * This internal function performs the 'after body' insertion mode
4354	 * logic for the generalized WP_HTML_Processor::step() function.
4355	 *
4356	 * @since 6.7.0
4357	 *
4358	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4359	 *
4360	 * @see https://html.spec.whatwg.org/#parsing-main-afterbody
4361	 * @see WP_HTML_Processor::step
4362	 *
4363	 * @return bool Whether an element was found.
4364	 */
4365	private function step_after_body(): bool {
4366		$tag_name   = $this->get_token_name();
4367		$token_type = $this->get_token_type();
4368		$op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4369		$op         = "{$op_sigil}{$tag_name}";
4370
4371		switch ( $op ) {
4372			/*
4373			 * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4374			 * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4375			 *
4376			 * > Process the token using the rules for the "in body" insertion mode.
4377			 */
4378			case '#text':
4379				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4380					return $this->step_in_body();
4381				}
4382				goto after_body_anything_else;
4383				break;
4384
4385			/*
4386			 * > A comment token
4387			 */
4388			case '#comment':
4389			case '#funky-comment':
4390			case '#presumptuous-tag':
4391				$this->bail( 'Content outside of BODY is unsupported.' );
4392				break;
4393
4394			/*
4395			 * > A DOCTYPE token
4396			 */
4397			case 'html':
4398				// Parse error: ignore the token.
4399				return $this->step();
4400
4401			/*
4402			 * > A start tag whose tag name is "html"
4403			 */
4404			case '+HTML':
4405				return $this->step_in_body();
4406
4407			/*
4408			 * > An end tag whose tag name is "html"
4409			 *
4410			 * > If the parser was created as part of the HTML fragment parsing algorithm,
4411			 * > this is a parse error; ignore the token. (fragment case)
4412			 * >
4413			 * > Otherwise, switch the insertion mode to "after after body".
4414			 */
4415			case '-HTML':
4416				if ( isset( $this->context_node ) ) {
4417					return $this->step();
4418				}
4419
4420				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_BODY;
4421				/*
4422				 * The HTML element is not removed from the stack of open elements.
4423				 * Only internal state has changed; this does not qualify as a "step"
4424				 * in terms of advancing through the document to another token.
4425				 * Nothing has been pushed or popped.
4426				 * Proceed to parse the next item.
4427				 */
4428				return $this->step();
4429		}
4430
4431		/*
4432		 * > Anything else: Parse error. Switch the insertion mode to "in body" and reprocess the token.
4433		 */
4434		after_body_anything_else:
4435		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4436		return $this->step( self::REPROCESS_CURRENT_NODE );
4437	}
4438
4439	/**
4440	 * Parses next element in the 'in frameset' insertion mode.
4441	 *
4442	 * This internal function performs the 'in frameset' insertion mode
4443	 * logic for the generalized WP_HTML_Processor::step() function.
4444	 *
4445	 * @since 6.7.0
4446	 *
4447	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4448	 *
4449	 * @see https://html.spec.whatwg.org/#parsing-main-inframeset
4450	 * @see WP_HTML_Processor::step
4451	 *
4452	 * @return bool Whether an element was found.
4453	 */
4454	private function step_in_frameset(): bool {
4455		$tag_name   = $this->get_token_name();
4456		$token_type = $this->get_token_type();
4457		$op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4458		$op         = "{$op_sigil}{$tag_name}";
4459
4460		switch ( $op ) {
4461			/*
4462			 * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4463			 * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4464			 * >
4465			 * > Insert the character.
4466			 *
4467			 * This algorithm effectively strips non-whitespace characters from text and inserts
4468			 * only the remaining whitespace. This is not supported at this time.
4469			 */
4470			case '#text':
4471				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4472					return $this->step_in_body();
4473				}
4474				$this->bail( 'Non-whitespace characters cannot be handled in frameset.' );
4475				break;
4476
4477			/*
4478			 * > A comment token
4479			 */
4480			case '#comment':
4481			case '#funky-comment':
4482			case '#presumptuous-tag':
4483				$this->insert_html_element( $this->state->current_token );
4484				return true;
4485
4486			/*
4487			 * > A DOCTYPE token
4488			 */
4489			case 'html':
4490				// Parse error: ignore the token.
4491				return $this->step();
4492
4493			/*
4494			 * > A start tag whose tag name is "html"
4495			 */
4496			case '+HTML':
4497				return $this->step_in_body();
4498
4499			/*
4500			 * > A start tag whose tag name is "frameset"
4501			 */
4502			case '+FRAMESET':
4503				$this->insert_html_element( $this->state->current_token );
4504				return true;
4505
4506			/*
4507			 * > An end tag whose tag name is "frameset"
4508			 */
4509			case '-FRAMESET':
4510				/*
4511				 * > If the current node is the root html element, then this is a parse error;
4512				 * > ignore the token. (fragment case)
4513				 */
4514				if ( $this->state->stack_of_open_elements->current_node_is( 'HTML' ) ) {
4515					return $this->step();
4516				}
4517
4518				/*
4519				 * > Otherwise, pop the current node from the stack of open elements.
4520				 */
4521				$this->state->stack_of_open_elements->pop();
4522
4523				/*
4524				 * > If the parser was not created as part of the HTML fragment parsing algorithm
4525				 * > (fragment case), and the current node is no longer a frameset element, then
4526				 * > switch the insertion mode to "after frameset".
4527				 */
4528				if ( ! isset( $this->context_node ) && ! $this->state->stack_of_open_elements->current_node_is( 'FRAMESET' ) ) {
4529					$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_FRAMESET;
4530				}
4531
4532				return true;
4533
4534			/*
4535			 * > A start tag whose tag name is "frame"
4536			 *
4537			 * > Insert an HTML element for the token. Immediately pop the
4538			 * > current node off the stack of open elements.
4539			 * >
4540			 * > Acknowledge the token's self-closing flag, if it is set.
4541			 */
4542			case '+FRAME':
4543				$this->insert_html_element( $this->state->current_token );
4544				$this->state->stack_of_open_elements->pop();
4545				return true;
4546
4547			/*
4548			 * > A start tag whose tag name is "noframes"
4549			 */
4550			case '+NOFRAMES':
4551				return $this->step_in_head();
4552		}
4553
4554		// Parse error: ignore the token.
4555		return $this->step();
4556	}
4557
4558	/**
4559	 * Parses next element in the 'after frameset' insertion mode.
4560	 *
4561	 * This internal function performs the 'after frameset' insertion mode
4562	 * logic for the generalized WP_HTML_Processor::step() function.
4563	 *
4564	 * @since 6.7.0 Stub implementation.
4565	 *
4566	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4567	 *
4568	 * @see https://html.spec.whatwg.org/#parsing-main-afterframeset
4569	 * @see WP_HTML_Processor::step
4570	 *
4571	 * @return bool Whether an element was found.
4572	 */
4573	private function step_after_frameset(): bool {
4574		$tag_name   = $this->get_token_name();
4575		$token_type = $this->get_token_type();
4576		$op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4577		$op         = "{$op_sigil}{$tag_name}";
4578
4579		switch ( $op ) {
4580			/*
4581			 * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4582			 * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4583			 * >
4584			 * > Insert the character.
4585			 *
4586			 * Per the specification, non-whitespace characters are ignored here and only the
4587			 * whitespace is kept. Splitting a text node this way is not supported at this time.
4588			 */
4589			case '#text':
4590				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4591					return $this->step_in_body();
4592				}
4593				$this->bail( 'Non-whitespace characters cannot be handled in after frameset.' );
4594				break;
4595
4596			/*
4597			 * > A comment token
4598			 */
4599			case '#comment':
4600			case '#funky-comment':
4601			case '#presumptuous-tag':
4602				$this->insert_html_element( $this->state->current_token );
4603				return true;
4604
4605			/*
4606			 * > A DOCTYPE token
4607			 */
4608			case 'html':
4609				// Parse error: ignore the token.
4610				return $this->step();
4611
4612			/*
4613			 * > A start tag whose tag name is "html"
4614			 */
4615			case '+HTML':
4616				return $this->step_in_body();
4617
4618			/*
4619			 * > An end tag whose tag name is "html"
4620			 */
4621			case '-HTML':
4622				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_FRAMESET;
4623				/*
4624				 * The HTML element is not removed from the stack of open elements.
4625				 * Only internal state has changed, this does not qualify as a "step"
4626				 * in terms of advancing through the document to another token.
4627				 * Nothing has been pushed or popped.
4628				 * Proceed to parse the next item.
4629				 */
4630				return $this->step();
4631
4632			/*
4633			 * > A start tag whose tag name is "noframes"
4634			 */
4635			case '+NOFRAMES':
4636				return $this->step_in_head();
4637		}
4638
4639		// Parse error: ignore the token.
4640		return $this->step();
4641	}
4642
4643	/**
4644	 * Parses next element in the 'after after body' insertion mode.
4645	 *
4646	 * This internal function performs the 'after after body' insertion mode
4647	 * logic for the generalized WP_HTML_Processor::step() function.
4648	 *
4649	 * @since 6.7.0 Stub implementation.
4650	 *
4651	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4652	 *
4653	 * @see https://html.spec.whatwg.org/#the-after-after-body-insertion-mode
4654	 * @see WP_HTML_Processor::step
4655	 *
4656	 * @return bool Whether an element was found.
4657	 */
4658	private function step_after_after_body(): bool {
4659		$tag_name   = $this->get_token_name();
4660		$token_type = $this->get_token_type();
4661		$op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4662		$op         = "{$op_sigil}{$tag_name}";
4663
4664		switch ( $op ) {
4665			/*
4666			 * > A comment token
4667			 */
4668			case '#comment':
4669			case '#funky-comment':
4670			case '#presumptuous-tag':
4671				$this->bail( 'Content outside of HTML is unsupported.' );
4672				break;
4673
4674			/*
4675			 * > A DOCTYPE token
4676			 * > A start tag whose tag name is "html"
4677			 *
4678			 * > Process the token using the rules for the "in body" insertion mode.
4679			 */
4680			case 'html':
4681			case '+HTML':
4682				return $this->step_in_body();
4683
4684			/*
4685			 * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4686			 * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4687			 * >
4688			 * > Process the token using the rules for the "in body" insertion mode.
4689			 */
4690			case '#text':
4691				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4692					return $this->step_in_body();
4693				}
4694				goto after_after_body_anything_else;
4695				break;
4696		}
4697
4698		/*
4699		 * > Parse error. Switch the insertion mode to "in body" and reprocess the token.
4700		 */
4701		after_after_body_anything_else:
4702		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
4703		return $this->step( self::REPROCESS_CURRENT_NODE );
4704	}
4705
4706	/**
4707	 * Parses next element in the 'after after frameset' insertion mode.
4708	 *
4709	 * This internal function performs the 'after after frameset' insertion mode
4710	 * logic for the generalized WP_HTML_Processor::step() function.
4711	 *
4712	 * @since 6.7.0 Stub implementation.
4713	 *
4714	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4715	 *
4716	 * @see https://html.spec.whatwg.org/#the-after-after-frameset-insertion-mode
4717	 * @see WP_HTML_Processor::step
4718	 *
4719	 * @return bool Whether an element was found.
4720	 */
4721	private function step_after_after_frameset(): bool {
4722		$tag_name   = $this->get_token_name();
4723		$token_type = $this->get_token_type();
4724		$op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4725		$op         = "{$op_sigil}{$tag_name}";
4726
4727		switch ( $op ) {
4728			/*
4729			 * > A comment token
4730			 */
4731			case '#comment':
4732			case '#funky-comment':
4733			case '#presumptuous-tag':
4734				$this->bail( 'Content outside of HTML is unsupported.' );
4735				break;
4736
4737			/*
4738			 * > A DOCTYPE token
4739			 * > A start tag whose tag name is "html"
4740			 *
4741			 * > Process the token using the rules for the "in body" insertion mode.
4742			 */
4743			case 'html':
4744			case '+HTML':
4745				return $this->step_in_body();
4746
4747			/*
4748			 * > A character token that is one of U+0009 CHARACTER TABULATION, U+000A LINE FEED (LF),
4749			 * >   U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), or U+0020 SPACE
4750			 * >
4751			 * > Process the token using the rules for the "in body" insertion mode.
4752			 *
4753			 * Per the specification, non-whitespace characters are ignored here and only the
4754			 * whitespace is kept. Splitting a text node this way is not supported at this time.
4755			 */
4756			case '#text':
4757				if ( parent::TEXT_IS_WHITESPACE === $this->text_node_classification ) {
4758					return $this->step_in_body();
4759				}
4760				$this->bail( 'Non-whitespace characters cannot be handled in after after frameset.' );
4761				break;
4762
4763			/*
4764			 * > A start tag whose tag name is "noframes"
4765			 */
4766			case '+NOFRAMES':
4767				return $this->step_in_head();
4768		}
4769
4770		// Parse error: ignore the token.
4771		return $this->step();
4772	}
4773
4774	/**
4775	 * Parses next element in the 'in foreign content' insertion mode.
4776	 *
4777	 * This internal function performs the 'in foreign content' insertion mode
4778	 * logic for the generalized WP_HTML_Processor::step() function.
4779	 *
4780	 * @since 6.7.0 Stub implementation.
4781	 *
4782	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
4783	 *
4784	 * @see https://html.spec.whatwg.org/#parsing-main-inforeign
4785	 * @see WP_HTML_Processor::step
4786	 *
4787	 * @return bool Whether an element was found.
4788	 */
4789	private function step_in_foreign_content(): bool {
4790		$tag_name   = $this->get_token_name();
4791		$token_type = $this->get_token_type();
4792		$op_sigil   = '#tag' === $token_type ? ( $this->is_tag_closer() ? '-' : '+' ) : '';
4793		$op         = "{$op_sigil}{$tag_name}";
4794
4795		/*
4796		 * > A start tag whose name is "font", if the token has any attributes named "color", "face", or "size"
4797		 *
4798		 * This section drawn out above the switch to more easily incorporate
4799		 * the additional rules based on the presence of the attributes.
4800		 */
4801		if (
4802			'+FONT' === $op &&
4803			(
4804				null !== $this->get_attribute( 'color' ) ||
4805				null !== $this->get_attribute( 'face' ) ||
4806				null !== $this->get_attribute( 'size' )
4807			)
4808		) {
4809			$op = '+FONT with attributes';
4810		}
4811
4812		switch ( $op ) {
4813			case '#text':
4814				/*
4815				 * > A character token that is U+0000 NULL
4816				 *
4817				 * This is handled by `get_modifiable_text()`.
4818				 */
4819
4820				/*
4821				 * Whitespace-only text does not affect the frameset-ok flag.
4822				 * It is probably inter-element whitespace, but it may also
4823				 * contain character references which decode only to whitespace.
4824				 */
4825				if ( parent::TEXT_IS_GENERIC === $this->text_node_classification ) {
4826					$this->state->frameset_ok = false;
4827				}
4828
4829				$this->insert_foreign_element( $this->state->current_token, false );
4830				return true;
4831
4832			/*
4833			 * CDATA sections are alternate wrappers for text content and therefore
4834			 * ought to follow the same rules as text nodes.
4835			 */
4836			case '#cdata-section':
4837				/*
4838				 * NULL bytes and whitespace do not change the frameset-ok flag.
4839				 */
4840				$current_token        = $this->bookmarks[ $this->state->current_token->bookmark_name ];
4841				$cdata_content_start  = $current_token->start + 9;
4842				$cdata_content_length = $current_token->length - 12;
4843				if ( strspn( $this->html, "\0 \t\n\f\r", $cdata_content_start, $cdata_content_length ) !== $cdata_content_length ) {
4844					$this->state->frameset_ok = false;
4845				}
4846
4847				$this->insert_foreign_element( $this->state->current_token, false );
4848				return true;
4849
4850			/*
4851			 * > A comment token
4852			 */
4853			case '#comment':
4854			case '#funky-comment':
4855			case '#presumptuous-tag':
4856				$this->insert_foreign_element( $this->state->current_token, false );
4857				return true;
4858
4859			/*
4860			 * > A DOCTYPE token
4861			 */
4862			case 'html':
4863				// Parse error: ignore the token.
4864				return $this->step();
4865
4866			/*
4867			 * > A start tag whose tag name is "b", "big", "blockquote", "body", "br", "center",
4868			 * > "code", "dd", "div", "dl", "dt", "em", "embed", "h1", "h2", "h3", "h4", "h5",
4869			 * > "h6", "head", "hr", "i", "img", "li", "listing", "menu", "meta", "nobr", "ol",
4870			 * > "p", "pre", "ruby", "s", "small", "span", "strong", "strike", "sub", "sup",
4871			 * > "table", "tt", "u", "ul", "var"
4872			 *
4873			 * > A start tag whose name is "font", if the token has any attributes named "color", "face", or "size"
4874			 *
4875			 * > An end tag whose tag name is "br", "p"
4876			 *
4877			 * Closing BR tags are always reported by the Tag Processor as opening tags.
4878			 */
4879			case '+B':
4880			case '+BIG':
4881			case '+BLOCKQUOTE':
4882			case '+BODY':
4883			case '+BR':
4884			case '+CENTER':
4885			case '+CODE':
4886			case '+DD':
4887			case '+DIV':
4888			case '+DL':
4889			case '+DT':
4890			case '+EM':
4891			case '+EMBED':
4892			case '+H1':
4893			case '+H2':
4894			case '+H3':
4895			case '+H4':
4896			case '+H5':
4897			case '+H6':
4898			case '+HEAD':
4899			case '+HR':
4900			case '+I':
4901			case '+IMG':
4902			case '+LI':
4903			case '+LISTING':
4904			case '+MENU':
4905			case '+META':
4906			case '+NOBR':
4907			case '+OL':
4908			case '+P':
4909			case '+PRE':
4910			case '+RUBY':
4911			case '+S':
4912			case '+SMALL':
4913			case '+SPAN':
4914			case '+STRONG':
4915			case '+STRIKE':
4916			case '+SUB':
4917			case '+SUP':
4918			case '+TABLE':
4919			case '+TT':
4920			case '+U':
4921			case '+UL':
4922			case '+VAR':
4923			case '+FONT with attributes':
4924			case '-BR':
4925			case '-P':
4926				// @todo Indicate a parse error once it's possible.
4927				foreach ( $this->state->stack_of_open_elements->walk_up() as $current_node ) {
4928					if (
4929						'math' === $current_node->integration_node_type ||
4930						'html' === $current_node->integration_node_type ||
4931						'html' === $current_node->namespace
4932					) {
4933						break;
4934					}
4935
4936					$this->state->stack_of_open_elements->pop();
4937				}
4938				goto in_foreign_content_process_in_current_insertion_mode;
4939		}
4940
4941		/*
4942		 * > Any other start tag
4943		 */
4944		if ( ! $this->is_tag_closer() ) {
4945			$this->insert_foreign_element( $this->state->current_token, false );
4946
4947			/*
4948			 * > If the token has its self-closing flag set, then run
4949			 * > the appropriate steps from the following list:
4950			 * >
4951			 * >   ↪ the token's tag name is "script", and the new current node is in the SVG namespace
4952			 * >         Acknowledge the token's self-closing flag, and then act as
4953			 * >         described in the steps for a "script" end tag below.
4954			 * >
4955			 * >   ↪ Otherwise
4956			 * >         Pop the current node off the stack of open elements and
4957			 * >         acknowledge the token's self-closing flag.
4958			 *
4959			 * Since the rules for SCRIPT below indicate to pop the element off of the stack of
4960			 * open elements, which is the same for the Otherwise condition, there's no need to
4961			 * separate these checks. The difference comes when a parser operates with the scripting
4962			 * flag enabled, and executes the script, which this parser does not support.
4963			 */
4964			if ( $this->state->current_token->has_self_closing_flag ) {
4965				$this->state->stack_of_open_elements->pop();
4966			}
4967			return true;
4968		}
4969
4970		/*
4971		 * > An end tag whose name is "script", if the current node is an SVG script element.
4972		 */
4973		if ( $this->is_tag_closer() && 'SCRIPT' === $this->state->current_token->node_name && 'svg' === $this->state->current_token->namespace ) {
4974			$this->state->stack_of_open_elements->pop();
4975			return true;
4976		}
4977
4978		/*
4979		 * > Any other end tag
4980		 */
4981		if ( $this->is_tag_closer() ) {
4982			$node = $this->state->stack_of_open_elements->current_node();
4983			if ( $tag_name !== $node->node_name ) {
4984				// @todo Indicate a parse error once it's possible.
4985			}
4986			in_foreign_content_end_tag_loop:
4987			if ( $node === $this->state->stack_of_open_elements->at( 1 ) ) {
4988				return true;
4989			}
4990
4991			/*
4992			 * > If node's tag name, converted to ASCII lowercase, is the same as the tag name
4993			 * > of the token, pop elements from the stack of open elements until node has
4994			 * > been popped from the stack, and then return.
4995			 */
4996			if ( 0 === strcasecmp( $node->node_name, $tag_name ) ) {
4997				foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
4998					$this->state->stack_of_open_elements->pop();
4999					if ( $node === $item ) {
5000						return true;
5001					}
5002				}
5003			}
5004
5005			foreach ( $this->state->stack_of_open_elements->walk_up( $node ) as $item ) {
5006				$node = $item;
5007				break;
5008			}
5009
5010			if ( 'html' !== $node->namespace ) {
5011				goto in_foreign_content_end_tag_loop;
5012			}
5013
5014			in_foreign_content_process_in_current_insertion_mode:
5015			switch ( $this->state->insertion_mode ) {
5016				case WP_HTML_Processor_State::INSERTION_MODE_INITIAL:
5017					return $this->step_initial();
5018
5019				case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HTML:
5020					return $this->step_before_html();
5021
5022				case WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD:
5023					return $this->step_before_head();
5024
5025				case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD:
5026					return $this->step_in_head();
5027
5028				case WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD_NOSCRIPT:
5029					return $this->step_in_head_noscript();
5030
5031				case WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD:
5032					return $this->step_after_head();
5033
5034				case WP_HTML_Processor_State::INSERTION_MODE_IN_BODY:
5035					return $this->step_in_body();
5036
5037				case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE:
5038					return $this->step_in_table();
5039
5040				case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_TEXT:
5041					return $this->step_in_table_text();
5042
5043				case WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION:
5044					return $this->step_in_caption();
5045
5046				case WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP:
5047					return $this->step_in_column_group();
5048
5049				case WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY:
5050					return $this->step_in_table_body();
5051
5052				case WP_HTML_Processor_State::INSERTION_MODE_IN_ROW:
5053					return $this->step_in_row();
5054
5055				case WP_HTML_Processor_State::INSERTION_MODE_IN_CELL:
5056					return $this->step_in_cell();
5057
5058				case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT:
5059					return $this->step_in_select();
5060
5061				case WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE:
5062					return $this->step_in_select_in_table();
5063
5064				case WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE:
5065					return $this->step_in_template();
5066
5067				case WP_HTML_Processor_State::INSERTION_MODE_AFTER_BODY:
5068					return $this->step_after_body();
5069
5070				case WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET:
5071					return $this->step_in_frameset();
5072
5073				case WP_HTML_Processor_State::INSERTION_MODE_AFTER_FRAMESET:
5074					return $this->step_after_frameset();
5075
5076				case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_BODY:
5077					return $this->step_after_after_body();
5078
5079				case WP_HTML_Processor_State::INSERTION_MODE_AFTER_AFTER_FRAMESET:
5080					return $this->step_after_after_frameset();
5081
5082				// This should be unreachable but PHP doesn't have total type checking on switch.
5083				default:
5084					$this->bail( "Unaware of the requested parsing mode: '{$this->state->insertion_mode}'." );
5085			}
5086		}
5087
5088		$this->bail( 'Should not have been able to reach end of IN FOREIGN CONTENT processing. Check HTML API code.' );
5089		// This unnecessary return prevents tools from inaccurately reporting type errors.
5090		return false;
5091	}
5092
5093	/*
5094	 * Internal helpers
5095	 */
5096
5097	/**
5098	 * Creates a new bookmark for the currently-matched token and returns the generated name.
5099	 *
5100	 * @since 6.4.0
5101	 * @since 6.5.0 Renamed from bookmark_tag() to bookmark_token().
5102	 *
5103	 * @throws Exception When unable to allocate requested bookmark.
5104	 *
5105	 * @return string|false Name of created bookmark, or false if unable to create.
5106	 */
5107	private function bookmark_token() {
5108		if ( ! parent::set_bookmark( ++$this->bookmark_counter ) ) {
5109			$this->last_error = self::ERROR_EXCEEDED_MAX_BOOKMARKS;
5110			throw new Exception( 'could not allocate bookmark' );
5111		}
5112
5113		return "{$this->bookmark_counter}";
5114	}
5115
5116	/*
5117	 * HTML semantic overrides for Tag Processor
5118	 */
5119
5120	/**
5121	 * Indicates the namespace of the current token, or "html" if there is none.
5122	 *
5123	 * @return string One of "html", "math", or "svg".
5124	 */
5125	public function get_namespace(): string {
5126		if ( ! isset( $this->current_element ) ) {
5127			return parent::get_namespace();
5128		}
5129
5130		return $this->current_element->token->namespace;
5131	}
5132
5133	/**
5134	 * Returns the uppercase name of the matched tag.
5135	 *
5136	 * The semantic rules for HTML specify that certain tags be reprocessed
5137	 * with a different tag name. Because of this, the tag name presented
5138	 * by the HTML Processor may differ from the one reported by the HTML
5139	 * Tag Processor, which doesn't apply these semantic rules.
5140	 *
5141	 * Example:
5142	 *
5143	 *     $processor = new WP_HTML_Tag_Processor( '<div class="test">Test</div>' );
5144	 *     $processor->next_tag() === true;
5145	 *     $processor->get_tag() === 'DIV';
5146	 *
5147	 *     $processor->next_tag() === false;
5148	 *     $processor->get_tag() === null;
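	 *
	 * A sketch of the renaming described above, assuming a fragment parsed
	 * in the default BODY context (the markup itself is illustrative):
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<image src="cat.png">' );
	 *     $processor->next_tag() === true;
	 *     $processor->get_tag() === 'IMG';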
5149	 *
5150	 * @since 6.4.0
5151	 *
5152	 * @return string|null Name of currently matched tag in input HTML, or `null` if none found.
5153	 */
5154	public function get_tag(): ?string {
5155		if ( null !== $this->last_error ) {
5156			return null;
5157		}
5158
5159		if ( $this->is_virtual() ) {
5160			return $this->current_element->token->node_name;
5161		}
5162
5163		$tag_name = parent::get_tag();
5164
5165		/*
5166		 * > A start tag whose tag name is "image"
5167		 * > Change the token's tag name to "img" and reprocess it. (Don't ask.)
5168		 */
5169		return ( 'IMAGE' === $tag_name && 'html' === $this->get_namespace() )
5170			? 'IMG'
5171			: $tag_name;
5172	}
5173
5174	/**
5175	 * Indicates if the currently matched tag contains the self-closing flag.
5176	 *
5177	 * No HTML elements ought to have the self-closing flag and for those, the self-closing
5178	 * flag will be ignored. For void elements this is benign because they "self close"
5179	 * automatically. For non-void HTML elements though problems will appear if someone
5180	 * intends to use a self-closing element in place of that element with an empty body.
5181	 * For HTML foreign elements and custom elements the self-closing flag determines if
5182	 * they self-close or not.
5183	 *
5184	 * This function does not determine if a tag is self-closing,
5185	 * but only if the self-closing flag is present in the syntax.
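	 *
	 * Example (a minimal sketch; the markup is illustrative). The flag is
	 * reported because it appears in the syntax, even though a DIV element
	 * never self-closes:
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<div />' );
	 *     $processor->next_tag() === true;
	 *     $processor->has_self_closing_flag() === true;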
5186	 *
5187	 * @since 6.6.0 Subclassed for the HTML Processor.
5188	 *
5189	 * @return bool Whether the currently matched tag contains the self-closing flag.
5190	 */
5191	public function has_self_closing_flag(): bool {
5192		return $this->is_virtual() ? false : parent::has_self_closing_flag();
5193	}
5194
5195	/**
5196	 * Returns the node name represented by the token.
5197	 *
5198	 * This matches the DOM API value `nodeName`. Some values
5199	 * are static, such as `#text` for a text node, while others
5200	 * are dynamically generated from the token itself.
5201	 *
5202	 * Dynamic names:
5203	 *  - Uppercase tag name for tag matches.
5204	 *  - `html` for DOCTYPE declarations.
5205	 *
5206	 * Note that if the Tag Processor is not matched on a token
5207	 * then this function will return `null`, either because it
5208	 * hasn't yet found a token or because it reached the end
5209	 * of the document without matching a token.
5210	 *
5211	 * @since 6.6.0 Subclassed for the HTML Processor.
5212	 *
5213	 * @return string|null Name of the matched token.
5214	 */
5215	public function get_token_name(): ?string {
5216		return $this->is_virtual()
5217			? $this->current_element->token->node_name
5218			: parent::get_token_name();
5219	}
5220
5221	/**
5222	 * Indicates the kind of matched token, if any.
5223	 *
5224	 * This differs from `get_token_name()` in that it always
5225	 * returns a static string indicating the type, whereas
5226	 * `get_token_name()` may return values derived from the
5227	 * token itself, such as a tag name or processing
5228	 * instruction tag.
5229	 *
5230	 * Possible values:
5231	 *  - `#tag` when matched on a tag.
5232	 *  - `#text` when matched on a text node.
5233	 *  - `#cdata-section` when matched on a CDATA node.
5234	 *  - `#comment` when matched on a comment.
5235	 *  - `#doctype` when matched on a DOCTYPE declaration.
5236	 *  - `#presumptuous-tag` when matched on an empty tag closer.
5237	 *  - `#funky-comment` when matched on a funky comment.
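	 *
	 * Example (an illustrative sketch; the input string is hypothetical):
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<p>Hi</p>' );
	 *     $processor->next_token() === true;
	 *     $processor->get_token_type() === '#tag';
	 *     $processor->get_token_name() === 'P';
	 *
	 *     $processor->next_token() === true;
	 *     $processor->get_token_type() === '#text';
	 *     $processor->get_token_name() === '#text';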
5238	 *
5239	 * @since 6.6.0 Subclassed for the HTML Processor.
5240	 *
5241	 * @return string|null What kind of token is matched, or null.
5242	 */
5243	public function get_token_type(): ?string {
5244		if ( $this->is_virtual() ) {
5245			/*
5246			 * This logic comes from the Tag Processor.
5247			 *
5248			 * @todo It would be ideal not to repeat this here, but it's not clearly
5249			 *       better to allow passing a token name to `get_token_type()`.
5250			 */
5251			$node_name     = $this->current_element->token->node_name;
5252			$starting_char = $node_name[0];
5253			if ( 'A' <= $starting_char && 'Z' >= $starting_char ) {
5254				return '#tag';
5255			}
5256
5257			if ( 'html' === $node_name ) {
5258				return '#doctype';
5259			}
5260
5261			return $node_name;
5262		}
5263
5264		return parent::get_token_type();
5265	}
5266
5267	/**
5268	 * Returns the value of a requested attribute from a matched tag opener if that attribute exists.
5269	 *
5270	 * Example:
5271	 *
5272	 *     $p = WP_HTML_Processor::create_fragment( '<div enabled class="test" data-test-id="14">Test</div>' );
5273	 *     $p->next_token() === true;
5274	 *     $p->get_attribute( 'data-test-id' ) === '14';
5275	 *     $p->get_attribute( 'enabled' ) === true;
5276	 *     $p->get_attribute( 'aria-label' ) === null;
5277	 *
5278	 *     $p->next_tag() === false;
5279	 *     $p->get_attribute( 'class' ) === null;
5280	 *
5281	 * @since 6.6.0 Subclassed for HTML Processor.
5282	 *
5283	 * @param string $name Name of attribute whose value is requested.
5284	 * @return string|true|null Value of attribute or `null` if not available. Boolean attributes return `true`.
5285	 */
5286	public function get_attribute( $name ) {
5287		return $this->is_virtual() ? null : parent::get_attribute( $name );
5288	}
5289
5290	/**
5291	 * Updates or creates a new attribute on the currently matched tag with the passed value.
5292	 *
5293	 * This function handles all necessary HTML encoding. Provide normal, unescaped string values.
5294	 * The HTML API will encode the strings appropriately so that the browser will interpret them
5295	 * as the intended value.
5296	 *
5297	 * Example:
5298	 *
5299	 *     // Renders “Eggs & Milk” in a browser, encoded as `<abbr title="Eggs &amp; Milk">`.
5300	 *     $processor->set_attribute( 'title', 'Eggs & Milk' );
5301	 *
5302	 *     // Renders “Eggs &amp; Milk” in a browser, encoded as `<abbr title="Eggs &amp;amp; Milk">`.
5303	 *     $processor->set_attribute( 'title', 'Eggs &amp; Milk' );
5304	 *
5305	 *     // Renders `true` as `<abbr title>`.
5306	 *     $processor->set_attribute( 'title', true );
5307	 *
5308	 *     // Renders without the attribute for `false` as `<abbr>`.
5309	 *     $processor->set_attribute( 'title', false );
5310	 *
5311	 * Special handling is provided for boolean attribute values:
5312	 *  - When `true` is passed as the value, then only the attribute name is added to the tag.
5313	 *  - When `false` is passed, the attribute gets removed if it existed before.
5314	 *
5315	 * @since 6.6.0 Subclassed for the HTML Processor.
5316	 * @since 6.9.0 Escapes all character references instead of trying to avoid double-escaping.
5317	 *
5318	 * @param string      $name  The attribute name to target.
5319	 * @param string|bool $value The new attribute value.
5320	 * @return bool Whether an attribute value was set.
5321	 */
5322	public function set_attribute( $name, $value ): bool {
5323		return $this->is_virtual() ? false : parent::set_attribute( $name, $value );
5324	}
5325
5326	/**
5327	 * Remove an attribute from the currently-matched tag.
5328	 *
5329	 * @since 6.6.0 Subclassed for HTML Processor.
5330	 *
5331	 * @param string $name The attribute name to remove.
5332	 * @return bool Whether an attribute was removed.
5333	 */
5334	public function remove_attribute( $name ): bool {
5335		return $this->is_virtual() ? false : parent::remove_attribute( $name );
5336	}
5337
5338	/**
5339	 * Gets lowercase names of all attributes matching a given prefix in the current tag.
5340	 *
5341	 * Note that matching is case-insensitive. This is in accordance with the spec:
5342	 *
5343	 * > There must never be two or more attributes on
5344	 * > the same start tag whose names are an ASCII
5345	 * > case-insensitive match for each other.
5346	 *     - HTML 5 spec
5347	 *
5348	 * Example:
5349	 *
5350	 *     $p = new WP_HTML_Tag_Processor( '<div data-ENABLED class="test" DATA-test-id="14">Test</div>' );
5351	 *     $p->next_tag( array( 'class_name' => 'test' ) ) === true;
5352	 *     $p->get_attribute_names_with_prefix( 'data-' ) === array( 'data-enabled', 'data-test-id' );
5353	 *
5354	 *     $p->next_tag() === false;
5355	 *     $p->get_attribute_names_with_prefix( 'data-' ) === null;
5356	 *
5357	 * @since 6.6.0 Subclassed for the HTML Processor.
5358	 *
5359	 * @see https://html.spec.whatwg.org/multipage/syntax.html#attributes-2:ascii-case-insensitive
5360	 *
5361	 * @param string $prefix Prefix of requested attribute names.
5362	 * @return array|null List of attribute names, or `null` when no tag opener is matched.
5363	 */
5364	public function get_attribute_names_with_prefix( $prefix ): ?array {
5365		return $this->is_virtual() ? null : parent::get_attribute_names_with_prefix( $prefix );
5366	}
5367
5368	/**
5369	 * Adds a new class name to the currently matched tag.
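	 *
	 * Example (a minimal sketch; the class names and markup are illustrative):
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<div class="wide">' );
	 *     $processor->next_tag() === true;
	 *     $processor->add_class( 'dark' ) === true;
	 *     $processor->get_updated_html() === '<div class="wide dark">';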
5370	 *
5371	 * @since 6.6.0 Subclassed for the HTML Processor.
5372	 *
5373	 * @param string $class_name The class name to add.
5374	 * @return bool Whether the class was set to be added.
5375	 */
5376	public function add_class( $class_name ): bool {
5377		return $this->is_virtual() ? false : parent::add_class( $class_name );
5378	}
5379
5380	/**
5381	 * Removes a class name from the currently matched tag.
5382	 *
5383	 * @since 6.6.0 Subclassed for the HTML Processor.
5384	 *
5385	 * @param string $class_name The class name to remove.
5386	 * @return bool Whether the class was set to be removed.
5387	 */
5388	public function remove_class( $class_name ): bool {
5389		return $this->is_virtual() ? false : parent::remove_class( $class_name );
5390	}
5391
5392	/**
5393	 * Returns whether the matched tag contains the given ASCII case-insensitive class name.
5394	 *
5395	 * @since 6.6.0 Subclassed for the HTML Processor.
5396	 *
5397	 * @todo When reconstructing active formatting elements with attributes, find a way
5398	 *       to indicate if the virtually-reconstructed formatting elements contain the
5399	 *       wanted class name.
5400	 *
5401	 * @param string $wanted_class Look for this CSS class name, ASCII case-insensitive.
5402	 * @return bool|null Whether the matched tag contains the given class name, or null if not matched.
5403	 */
5404	public function has_class( $wanted_class ): ?bool {
5405		return $this->is_virtual() ? null : parent::has_class( $wanted_class );
5406	}
5407
5408	/**
5409	 * Generator for a foreach loop to step through each class name for the matched tag.
5410	 *
5411	 * This generator function is designed to be used inside a "foreach" loop.
5412	 *
5413	 * Example:
5414	 *
5415	 *     $p = WP_HTML_Processor::create_fragment( "<div class='free &lt;egg&lt;\tlang-en'>" );
5416	 *     $p->next_tag();
5417	 *     foreach ( $p->class_list() as $class_name ) {
5418	 *         echo "{$class_name} ";
5419	 *     }
5420	 *     // Outputs: "free <egg< lang-en "
5421	 *
5422	 * @since 6.6.0 Subclassed for the HTML Processor.
5423	 */
5424	public function class_list() {
5425		return $this->is_virtual() ? null : parent::class_list();
5426	}
5427
5428	/**
5429	 * Returns the modifiable text for a matched token, or an empty string.
5430	 *
5431	 * Modifiable text is text content that may be read and changed without
5432	 * changing the HTML structure of the document around it. This includes
5433	 * the contents of `#text` nodes in the HTML as well as the inner
5434	 * contents of HTML comments, Processing Instructions, and others, even
5435	 * though these nodes aren't part of a parsed DOM tree. It also includes
5436	 * the contents of SCRIPT and STYLE tags, of TEXTAREA tags, and of any
5437	 * other section in an HTML document which cannot contain HTML markup (DATA).
5438	 *
5439	 * If a token has no modifiable text then an empty string is returned to
5440	 * avoid needless crashing or type errors. An empty string does not mean
5441	 * that a token has modifiable text, and a token with modifiable text may
5442	 * have an empty string (e.g. a comment with no contents).
5443	 *
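	 * Example (a minimal sketch; the markup is illustrative):
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<p>Hello</p><!-- note -->' );
	 *     while ( $processor->next_token() ) {
	 *         $type = $processor->get_token_type();
	 *         if ( '#text' === $type || '#comment' === $type ) {
	 *             // Text nodes and comments both carry modifiable text.
	 *             $text = $processor->get_modifiable_text();
	 *         }
	 *     }
	 *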
5444	 * @since 6.6.0 Subclassed for the HTML Processor.
5445	 *
5446	 * @return string
5447	 */
5448	public function get_modifiable_text(): string {
5449		return $this->is_virtual() ? '' : parent::get_modifiable_text();
5450	}
5451
5452	/**
5453	 * Indicates what kind of comment produced the comment node.
5454	 *
5455	 * Because there are different kinds of HTML syntax which produce
5456	 * comments, the Tag Processor tracks and exposes this as a type
5457	 * for the comment. Nominally only regular HTML comments exist as
5458	 * they are commonly known, but a number of unrelated syntax errors
5459	 * also produce comments.
5460	 *
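	 * Example (a minimal sketch; the markup is illustrative):
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<!-- a regular comment -->' );
	 *     while ( $processor->next_token() ) {
	 *         if ( '#comment' !== $processor->get_token_type() ) {
	 *             continue;
	 *         }
	 *
	 *         if ( WP_HTML_Processor::COMMENT_AS_HTML_COMMENT === $processor->get_comment_type() ) {
	 *             // An ordinary comment, as opposed to one produced by a syntax error.
	 *         }
	 *     }
	 *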
5461	 * @see self::COMMENT_AS_ABRUPTLY_CLOSED_COMMENT
5462	 * @see self::COMMENT_AS_CDATA_LOOKALIKE
5463	 * @see self::COMMENT_AS_INVALID_HTML
5464	 * @see self::COMMENT_AS_HTML_COMMENT
5465	 * @see self::COMMENT_AS_PI_NODE_LOOKALIKE
5466	 *
5467	 * @since 6.6.0 Subclassed for the HTML Processor.
5468	 *
5469	 * @return string|null
5470	 */
5471	public function get_comment_type(): ?string {
5472		return $this->is_virtual() ? null : parent::get_comment_type();
5473	}
5474
5475	/**
5476	 * Removes a bookmark that is no longer needed.
5477	 *
5478	 * Releasing a bookmark frees up the small
5479	 * performance overhead it requires.
5480	 *
5481	 * @since 6.4.0
5482	 *
5483	 * @param string $bookmark_name Name of the bookmark to remove.
5484	 * @return bool Whether the bookmark already existed before removal.
5485	 */
5486	public function release_bookmark( $bookmark_name ): bool {
5487		return parent::release_bookmark( "_{$bookmark_name}" );
5488	}
5489
5490	/**
5491	 * Moves the internal cursor in the HTML Processor to a given bookmark's location.
5492	 *
5493	 * Be careful! Seeking backwards to a previous location resets the parser to the
5494	 * start of the document and reparses the entire contents up until it finds the
5495	 * sought-after bookmarked location.
5496	 *
5497	 * In order to prevent accidental infinite loops, there's a
5498	 * maximum limit on the number of times seek() can be called.
5499	 *
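	 * Example (a minimal sketch; the bookmark and class names are illustrative):
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<ul><li>One</li><li>Two</li></ul>' );
	 *     if ( $processor->next_tag( 'UL' ) && $processor->set_bookmark( 'list' ) ) {
	 *         $processor->next_tag( 'LI' );
	 *
	 *         // Seeking backward reparses from the start, so do it sparingly.
	 *         if ( $processor->seek( 'list' ) ) {
	 *             $processor->add_class( 'has-items' );
	 *         }
	 *         $processor->release_bookmark( 'list' );
	 *     }
	 *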
5500	 * @throws Exception When unable to allocate a bookmark for the next token in the input HTML document.
5501	 *
5502	 * @since 6.4.0
5503	 *
5504	 * @param string $bookmark_name Jump to the place in the document identified by this bookmark name.
5505	 * @return bool Whether the internal cursor was successfully moved to the bookmark's location.
5506	 */
5507	public function seek( $bookmark_name ): bool {
5508		// Flush any pending updates to the document before beginning.
5509		$this->get_updated_html();
5510
5511		$actual_bookmark_name = "_{$bookmark_name}";
5512		$processor_started_at = $this->state->current_token
5513			? $this->bookmarks[ $this->state->current_token->bookmark_name ]->start
5514			: 0;
5515		$bookmark_starts_at   = $this->bookmarks[ $actual_bookmark_name ]->start;
5516		$direction            = $bookmark_starts_at > $processor_started_at ? 'forward' : 'backward';
5517
5518		/*
5519		 * If seeking backwards, it's possible that the sought-after bookmark exists within an element
5520		 * which has been closed before the current cursor; in other words, it has already been removed
5521		 * from the stack of open elements. This means that it's insufficient to simply pop off elements
5522		 * from the stack of open elements which appear after the bookmarked location and then jump to
5523		 * that location, as the elements which were open before won't be re-opened.
5524		 *
5525		 * In order to maintain consistency, the HTML Processor rewinds to the start of the document
5526		 * and reparses everything until it finds the sought-after bookmark.
5527		 *
5528		 * There are potentially better ways to do this: cache the parser state for each bookmark and
5529		 * restore it when seeking; store an immutable and idempotent register of where elements open
5530		 * and close.
5531		 *
5532		 * If caching the parser state it will be essential to properly maintain the cached stack of
5533		 * open elements and active formatting elements when modifying the document. This could be a
5534		 * tedious and time-consuming process as well, and so for now will not be performed.
5535		 *
5536		 * It may be possible to track bookmarks for where elements open and close, and in doing so
5537		 * be able to quickly recalculate breadcrumbs for any element in the document. It may even
5538		 * be possible to remove the stack of open elements and compute it on the fly this way.
5539		 * If doing this, the parser would need to track the opening and closing locations for all
5540		 * tokens in the breadcrumb path for any and all bookmarks. By utilizing bookmarks themselves
5541		 * this list could be automatically maintained while modifying the document. Finding the
5542		 * breadcrumbs would then amount to traversing that list from the start until the token
5543		 * being inspected. Once an element closes, if there are no bookmarks pointing to locations
5544		 * within that element, then all of these locations may be forgotten to save on memory use
5545		 * and computation time.
5546		 */
5547		if ( 'backward' === $direction ) {
5548
5549			/*
5550			 * When moving backward, stateful stacks should be cleared.
5551			 */
5552			foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
5553				$this->state->stack_of_open_elements->remove_node( $item );
5554			}
5555
5556			foreach ( $this->state->active_formatting_elements->walk_up() as $item ) {
5557				$this->state->active_formatting_elements->remove_node( $item );
5558			}
5559
5560			/*
5561			 * **After** clearing stacks, more processor state can be reset.
5562			 * This must be done after clearing the stack because those stacks generate events that
5563			 * would appear on a subsequent call to `next_token()`.
5564			 */
5565			$this->state->frameset_ok                       = true;
5566			$this->state->stack_of_template_insertion_modes = array();
5567			$this->state->head_element                      = null;
5568			$this->state->form_element                      = null;
5569			$this->state->current_token                     = null;
5570			$this->current_element                          = null;
5571			$this->element_queue                            = array();
5572
5573			/*
5574			 * The absence of a context node indicates a full parse.
5575			 * The presence of a context node indicates a fragment parser.
5576			 */
5577			if ( null === $this->context_node ) {
5578				$this->change_parsing_namespace( 'html' );
5579				$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_INITIAL;
5580				$this->breadcrumbs           = array();
5581
5582				$this->bookmarks['initial'] = new WP_HTML_Span( 0, 0 );
5583				parent::seek( 'initial' );
5584				unset( $this->bookmarks['initial'] );
5585			} else {
5586
5587				/*
5588				 * Push the root-node (HTML) back onto the stack of open elements.
5589				 *
5590				 * Fragment parsers require this extra bit of setup.
5591				 * It's handled in full parsers by advancing the processor state.
5592				 */
5593				$this->state->stack_of_open_elements->push(
5594					new WP_HTML_Token(
5595						'root-node',
5596						'HTML',
5597						false
5598					)
5599				);
5600
5601				$this->change_parsing_namespace(
5602					$this->context_node->integration_node_type
5603						? 'html'
5604						: $this->context_node->namespace
5605				);
5606
5607				if ( 'TEMPLATE' === $this->context_node->node_name ) {
5608					$this->state->stack_of_template_insertion_modes[] = WP_HTML_Processor_State::INSERTION_MODE_IN_TEMPLATE;
5609				}
5610
5611				$this->reset_insertion_mode_appropriately();
5612				$this->breadcrumbs = array_slice( $this->breadcrumbs, 0, 2 );
5613				parent::seek( $this->context_node->bookmark_name );
5614			}
5615		}
5616
5617		/*
5618		 * Here, the processor moves forward through the document until it matches the bookmark.
5619		 * do-while is used here because the processor is expected to already be stopped on
5620		 * a token that may match the bookmarked location.
5621		 */
5622		do {
5623			/*
5624			 * The processor will stop on virtual tokens, but bookmarks may not be set on them.
5625			 * They should not be matched when seeking a bookmark, skip them.
5626			 */
5627			if ( $this->is_virtual() ) {
5628				continue;
5629			}
5630			if ( $bookmark_starts_at === $this->bookmarks[ $this->state->current_token->bookmark_name ]->start ) {
5631				return true;
5632			}
5633		} while ( $this->next_token() );
5634
5635		return false;
5636	}
5637
5638	/**
5639	 * Sets a bookmark in the HTML document.
5640	 *
5641	 * Bookmarks represent specific places or tokens in the HTML
5642	 * document, such as a tag opener or closer. When applying
5643	 * edits to a document, such as setting an attribute, the
5644	 * text offsets of that token may shift; the bookmark is
5645	 * kept updated with those shifts and remains stable unless
5646	 * the entire span of text in which the token sits is removed.
5647	 *
5648	 * Release bookmarks when they are no longer needed.
5649	 *
5650	 * Example:
5651	 *
5652	 *     <main><h2>Surprising fact you may not know!</h2></main>
5653	 *           ^  ^
5654	 *            \-|-- this `H2` opener bookmark tracks the token
5655	 *
5656	 *     <main class="clickbait"><h2>Surprising fact you may no…
5657	 *                             ^  ^
5658	 *                              \-|-- it shifts with edits
5659	 *
5660	 * Bookmarks provide the ability to seek to a previously-scanned
5661	 * place in the HTML document. This avoids the need to re-scan
5662	 * the entire document.
5663	 *
5664	 * Example:
5665	 *
5666	 *     <ul><li>One</li><li>Two</li><li>Three</li></ul>
5667	 *                                 ^^^^
5668	 *                                 want to note this last item
5669	 *
5670	 *     $p = new WP_HTML_Tag_Processor( $html );
5671	 *     $in_list = false;
5672	 *     while ( $p->next_tag( array( 'tag_closers' => $in_list ? 'visit' : 'skip' ) ) ) {
5673	 *         if ( 'UL' === $p->get_tag() ) {
5674	 *             if ( $p->is_tag_closer() ) {
5675	 *                 $in_list = false;
5676	 *                 $p->set_bookmark( 'resume' );
5677	 *                 if ( $p->seek( 'last-li' ) ) {
5678	 *                     $p->add_class( 'last-li' );
5679	 *                 }
5680	 *                 $p->seek( 'resume' );
5681	 *                 $p->release_bookmark( 'last-li' );
5682	 *                 $p->release_bookmark( 'resume' );
5683	 *             } else {
5684	 *                 $in_list = true;
5685	 *             }
5686	 *         }
5687	 *
5688	 *         if ( 'LI' === $p->get_tag() ) {
5689	 *             $p->set_bookmark( 'last-li' );
5690	 *         }
5691	 *     }
5692	 *
5693	 * Bookmarks intentionally hide the internal string offsets
5694	 * to which they refer. They are maintained internally as
5695	 * updates are applied to the HTML document and therefore
5696	 * retain their "position" - the location to which they
5697	 * originally pointed. The inability to use bookmarks with
5698	 * functions like `substr` is therefore intentional to guard
5699	 * against accidentally breaking the HTML.
5700	 *
5701	 * Because bookmarks allocate memory and require processing
5702	 * for every applied update, they are limited and require
5703	 * a name. They should not be created with programmatically-made
5704	 * names, such as "li_{$index}" with some loop. As a general
5705	 * rule they should only be created with string-literal names
5706	 * like "start-of-section" or "last-paragraph".
5707	 *
5708	 * Bookmarks are a powerful tool to enable complicated behavior.
5709	 * Consider double-checking that you need this tool if you are
5710	 * reaching for it, as inappropriate use could lead to broken
5711	 * HTML structure or unwanted processing overhead.
5712	 *
5713	 * Bookmarks cannot be set on tokens that do not appear in the original
5714	 * HTML text. For example, the HTML `<table><td>` stops at tags `TABLE`,
5715	 * `TBODY`, `TR`, and `TD`. The `TBODY` and `TR` tags do not appear in
5716	 * the original HTML and cannot be used as bookmarks.
5717	 *
5718	 * @since 6.4.0
5719	 *
5720	 * @param string $bookmark_name Identifies this particular bookmark.
5721	 * @return bool Whether the bookmark was successfully created.
5722	 */
5723	public function set_bookmark( $bookmark_name ): bool {
5724		if ( $this->is_virtual() ) {
5725			_doing_it_wrong(
5726				__METHOD__,
5727				__( 'Cannot set bookmarks on tokens that do not appear in the original HTML text.' ),
5728				'6.8.0'
5729			);
5730			return false;
5731		}
5732		return parent::set_bookmark( "_{$bookmark_name}" );
5733	}
5734
5735	/**
5736	 * Checks whether a bookmark with the given name exists.
5737	 *
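	 * Example (a minimal sketch; the bookmark name is illustrative):
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<h2>Title</h2>' );
	 *     $processor->next_tag( 'H2' );
	 *     $processor->has_bookmark( 'heading' ); // false: nothing has been set yet.
	 *     $processor->set_bookmark( 'heading' );
	 *     $processor->has_bookmark( 'heading' ); // true until the bookmark is released.
	 *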
5738	 * @since 6.5.0
5739	 *
5740	 * @param string $bookmark_name Name to identify a bookmark that potentially exists.
5741	 * @return bool Whether that bookmark exists.
5742	 */
5743	public function has_bookmark( $bookmark_name ): bool {
5744		return parent::has_bookmark( "_{$bookmark_name}" );
5745	}
5746
5747	/*
5748	 * HTML Parsing Algorithms
5749	 */
5750
5751	/**
5752	 * Closes a P element.
5753	 *
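	 * Example of the visible effect (a sketch through the public API; the markup is
	 * illustrative). A second `P` opener closes the first, so both paragraphs are siblings:
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<p>One<p>Two' );
	 *     while ( $processor->next_tag( 'P' ) ) {
	 *         echo implode( ' > ', $processor->get_breadcrumbs() ), "\n";
	 *     }
	 *     // Each iteration prints "HTML > BODY > P".
	 *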
5754	 * @since 6.4.0
5755	 *
5756	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
5757	 *
5758	 * @see https://html.spec.whatwg.org/#close-a-p-element
5759	 */
5760	private function close_a_p_element(): void {
5761		$this->generate_implied_end_tags( 'P' );
5762		$this->state->stack_of_open_elements->pop_until( 'P' );
5763	}
5764
5765	/**
5766	 * Closes elements that have implied end tags.
5767	 *
5768	 * @since 6.4.0
5769	 * @since 6.7.0 Full spec support.
5770	 *
5771	 * @see https://html.spec.whatwg.org/#generate-implied-end-tags
5772	 *
5773	 * @param string|null $except_for_this_element Perform as if this element doesn't exist in the stack of open elements.
5774	 */
5775	private function generate_implied_end_tags( ?string $except_for_this_element = null ): void {
5776		$elements_with_implied_end_tags = array(
5777			'DD',
5778			'DT',
5779			'LI',
5780			'OPTGROUP',
5781			'OPTION',
5782			'P',
5783			'RB',
5784			'RP',
5785			'RT',
5786			'RTC',
5787		);
5788
5789		$no_exclusions = ! isset( $except_for_this_element );
5790
5791		while (
5792			( $no_exclusions || ! $this->state->stack_of_open_elements->current_node_is( $except_for_this_element ) ) &&
5793			in_array( $this->state->stack_of_open_elements->current_node()->node_name, $elements_with_implied_end_tags, true )
5794		) {
5795			$this->state->stack_of_open_elements->pop();
5796		}
5797	}
5798
5799	/**
5800	 * Closes elements that have implied end tags, thoroughly.
5801	 *
5802	 * See the HTML specification for an explanation why this is
5803	 * different from generating end tags in the normal sense.
5804	 *
5805	 * @since 6.4.0
5806	 * @since 6.7.0 Full spec support.
5807	 *
5808	 * @see WP_HTML_Processor::generate_implied_end_tags
5809	 * @see https://html.spec.whatwg.org/#generate-implied-end-tags
5810	 */
5811	private function generate_implied_end_tags_thoroughly(): void {
5812		$elements_with_implied_end_tags = array(
5813			'CAPTION',
5814			'COLGROUP',
5815			'DD',
5816			'DT',
5817			'LI',
5818			'OPTGROUP',
5819			'OPTION',
5820			'P',
5821			'RB',
5822			'RP',
5823			'RT',
5824			'RTC',
5825			'TBODY',
5826			'TD',
5827			'TFOOT',
5828			'TH',
5829			'THEAD',
5830			'TR',
5831		);
5832
5833		while ( in_array( $this->state->stack_of_open_elements->current_node()->node_name, $elements_with_implied_end_tags, true ) ) {
5834			$this->state->stack_of_open_elements->pop();
5835		}
5836	}
5837
5838	/**
5839	 * Returns the adjusted current node.
5840	 *
5841	 * > The adjusted current node is the context element if the parser was created as
5842	 * > part of the HTML fragment parsing algorithm and the stack of open elements
5843	 * > has only one element in it (fragment case); otherwise, the adjusted current
5844	 * > node is the current node.
5845	 *
5846	 * @see https://html.spec.whatwg.org/#adjusted-current-node
5847	 *
5848	 * @since 6.7.0
5849	 *
5850	 * @return WP_HTML_Token|null The adjusted current node.
5851	 */
5852	private function get_adjusted_current_node(): ?WP_HTML_Token {
5853		if ( isset( $this->context_node ) && 1 === $this->state->stack_of_open_elements->count() ) {
5854			return $this->context_node;
5855		}
5856
5857		return $this->state->stack_of_open_elements->current_node();
5858	}
5859
5860	/**
5861	 * Reconstructs the active formatting elements.
5862	 *
5863	 * > This has the effect of reopening all the formatting elements that were opened
5864	 * > in the current body, cell, or caption (whichever is youngest) that haven't
5865	 * > been explicitly closed.
5866	 *
5867	 * @since 6.4.0
5868	 *
5869	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
5870	 *
5871	 * @see https://html.spec.whatwg.org/#reconstruct-the-active-formatting-elements
5872	 *
5873	 * @return bool Whether any formatting elements needed to be reconstructed.
5874	 */
5875	private function reconstruct_active_formatting_elements(): bool {
5876		/*
5877		 * > If there are no entries in the list of active formatting elements, then there is nothing
5878		 * > to reconstruct; stop this algorithm.
5879		 */
5880		if ( 0 === $this->state->active_formatting_elements->count() ) {
5881			return false;
5882		}
5883
5884		$last_entry = $this->state->active_formatting_elements->current_node();
5885		if (
5886
5887			/*
5888			 * > If the last (most recently added) entry in the list of active formatting elements is a marker;
5889			 * > stop this algorithm.
5890			 */
5891			'marker' === $last_entry->node_name ||
5892
5893			/*
5894			 * > If the last (most recently added) entry in the list of active formatting elements is an
5895			 * > element that is in the stack of open elements, then there is nothing to reconstruct;
5896			 * > stop this algorithm.
5897			 */
5898			$this->state->stack_of_open_elements->contains_node( $last_entry )
5899		) {
5900			return false;
5901		}
5902
5903		$this->bail( 'Cannot reconstruct active formatting elements when advancing and rewinding is required.' );
5904	}
5905
5906	/**
5907	 * Runs the "reset the insertion mode appropriately" algorithm.
5908	 *
5909	 * @since 6.7.0
5910	 *
5911	 * @see https://html.spec.whatwg.org/multipage/parsing.html#reset-the-insertion-mode-appropriately
5912	 */
5913	private function reset_insertion_mode_appropriately(): void {
5914		// Set the first node.
5915		$first_node = null;
5916		foreach ( $this->state->stack_of_open_elements->walk_down() as $first_node ) {
5917			break;
5918		}
5919
5920		/*
5921		 * > 1. Let _last_ be false.
5922		 */
5923		$last = false;
5924		foreach ( $this->state->stack_of_open_elements->walk_up() as $node ) {
5925			/*
5926			 * > 2. Let _node_ be the last node in the stack of open elements.
5927			 * > 3. _Loop_: If _node_ is the first node in the stack of open elements, then set _last_
5928			 * >            to true, and, if the parser was created as part of the HTML fragment parsing
5929			 * >            algorithm (fragment case), set node to the context element passed to
5930			 * >            that algorithm.
5931			 * > …
5932			 */
5933			if ( $node === $first_node ) {
5934				$last = true;
5935				if ( isset( $this->context_node ) ) {
5936					$node = $this->context_node;
5937				}
5938			}
5939
5940			// All of the following rules are for matching HTML elements.
5941			if ( 'html' !== $node->namespace ) {
5942				continue;
5943			}
5944
5945			switch ( $node->node_name ) {
5946				/*
5947				 * > 4. If node is a `select` element, run these substeps:
5948				 * >   1. If _last_ is true, jump to the step below labeled done.
5949				 * >   2. Let _ancestor_ be _node_.
5950				 * >   3. _Loop_: If _ancestor_ is the first node in the stack of open elements,
5951				 * >      jump to the step below labeled done.
5952				 * >   4. Let ancestor be the node before ancestor in the stack of open elements.
5953				 * >   …
5954				 * >   7. Jump back to the step labeled _loop_.
5955				 * >   8. _Done_: Switch the insertion mode to "in select" and return.
5956				 */
5957				case 'SELECT':
5958					if ( ! $last ) {
5959						foreach ( $this->state->stack_of_open_elements->walk_up( $node ) as $ancestor ) {
5960							if ( 'html' !== $ancestor->namespace ) {
5961								continue;
5962							}
5963
5964							switch ( $ancestor->node_name ) {
5965								/*
5966								 * > 5. If _ancestor_ is a `template` node, jump to the step below
5967								 * >    labeled _done_.
5968								 */
5969								case 'TEMPLATE':
5970									break 2;
5971
5972								/*
5973								 * > 6. If _ancestor_ is a `table` node, switch the insertion mode to
5974								 * >    "in select in table" and return.
5975								 */
5976								case 'TABLE':
5977									$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT_IN_TABLE;
5978									return;
5979							}
5980						}
5981					}
5982					$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_SELECT;
5983					return;
5984
5985				/*
5986				 * > 5. If _node_ is a `td` or `th` element and _last_ is false, then switch the
5987				 * >    insertion mode to "in cell" and return.
5988				 */
5989				case 'TD':
5990				case 'TH':
5991					if ( ! $last ) {
5992						$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CELL;
5993						return;
5994					}
5995					break;
5996
5997				/*
5998				 * > 6. If _node_ is a `tr` element, then switch the insertion mode to "in row"
5999				 * >    and return.
6000				 */
6001				case 'TR':
6002					$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
6003					return;
6004
6005				/*
6006				 * > 7. If _node_ is a `tbody`, `thead`, or `tfoot` element, then switch the
6007				 * >    insertion mode to "in table body" and return.
6008				 */
6009				case 'TBODY':
6010				case 'THEAD':
6011				case 'TFOOT':
6012					$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE_BODY;
6013					return;
6014
6015				/*
6016				 * > 8. If _node_ is a `caption` element, then switch the insertion mode to
6017				 * >    "in caption" and return.
6018				 */
6019				case 'CAPTION':
6020					$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_CAPTION;
6021					return;
6022
6023				/*
6024				 * > 9. If _node_ is a `colgroup` element, then switch the insertion mode to
6025				 * >    "in column group" and return.
6026				 */
6027				case 'COLGROUP':
6028					$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_COLUMN_GROUP;
6029					return;
6030
6031				/*
6032				 * > 10. If _node_ is a `table` element, then switch the insertion mode to
6033				 * >     "in table" and return.
6034				 */
6035				case 'TABLE':
6036					$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_TABLE;
6037					return;
6038
6039				/*
6040				 * > 11. If _node_ is a `template` element, then switch the insertion mode to the
6041				 * >     current template insertion mode and return.
6042				 */
6043				case 'TEMPLATE':
6044					$this->state->insertion_mode = end( $this->state->stack_of_template_insertion_modes );
6045					return;
6046
6047				/*
6048				 * > 12. If _node_ is a `head` element and _last_ is false, then switch the
6049				 * >     insertion mode to "in head" and return.
6050				 */
6051				case 'HEAD':
6052					if ( ! $last ) {
6053						$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_HEAD;
6054						return;
6055					}
6056					break;
6057
6058				/*
6059				 * > 13. If _node_ is a `body` element, then switch the insertion mode to "in body"
6060				 * >     and return.
6061				 */
6062				case 'BODY':
6063					$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
6064					return;
6065
6066				/*
6067				 * > 14. If _node_ is a `frameset` element, then switch the insertion mode to
6068				 * >     "in frameset" and return. (fragment case)
6069				 */
6070				case 'FRAMESET':
6071					$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_FRAMESET;
6072					return;
6073
6074				/*
6075				 * > 15. If _node_ is an `html` element, run these substeps:
6076				 * >     1. If the head element pointer is null, switch the insertion mode to
6077				 * >        "before head" and return. (fragment case)
6078				 * >     2. Otherwise, the head element pointer is not null, switch the insertion
6079				 * >        mode to "after head" and return.
6080				 */
6081				case 'HTML':
6082					$this->state->insertion_mode = isset( $this->state->head_element )
6083						? WP_HTML_Processor_State::INSERTION_MODE_AFTER_HEAD
6084						: WP_HTML_Processor_State::INSERTION_MODE_BEFORE_HEAD;
6085					return;
6086			}
6087		}
6088
6089		/*
6090		 * > 16. If _last_ is true, then switch the insertion mode to "in body"
6091		 * >     and return. (fragment case)
6092		 *
6093		 * This is only reachable if `$last` is true, as per the fragment parsing case.
6094		 */
6095		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_BODY;
6096	}
6097
6098	/**
6099	 * Runs the adoption agency algorithm.
6100	 *
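	 * Example (a sketch through the public API; the markup is illustrative). Only part of
	 * the algorithm is implemented here: when a formatting element would have to be split
	 * around a block, as `B` around `P` below, the processor bails instead of guessing.
	 * Callers can detect this and leave the HTML unmodified:
	 *
	 *     $processor = WP_HTML_Processor::create_fragment( '<b><p>One</b>Two' );
	 *     while ( $processor->next_token() ) {
	 *         // Visit tokens until the processor stops.
	 *     }
	 *
	 *     if ( WP_HTML_Processor::ERROR_UNSUPPORTED === $processor->get_last_error() ) {
	 *         // Parsing stopped early on unsupported markup.
	 *     }
	 *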
6101	 * @since 6.4.0
6102	 *
6103	 * @throws WP_HTML_Unsupported_Exception When encountering unsupported HTML input.
6104	 *
6105	 * @see https://html.spec.whatwg.org/#adoption-agency-algorithm
6106	 */
6107	private function run_adoption_agency_algorithm(): void {
6108		$budget       = 1000;
6109		$subject      = $this->get_tag();
6110		$current_node = $this->state->stack_of_open_elements->current_node();
6111
6112		if (
6113			// > If the current node is an HTML element whose tag name is subject
6114			$current_node && $subject === $current_node->node_name &&
6115			// > the current node is not in the list of active formatting elements
6116			! $this->state->active_formatting_elements->contains_node( $current_node )
6117		) {
6118			$this->state->stack_of_open_elements->pop();
6119			return;
6120		}
6121
6122		$outer_loop_counter = 0;
6123		while ( $budget-- > 0 ) {
6124			if ( $outer_loop_counter++ >= 8 ) {
6125				return;
6126			}
6127
6128			/*
6129			 * > Let formatting element be the last element in the list of active formatting elements that:
6130			 * >   - is between the end of the list and the last marker in the list,
6131			 * >     if any, or the start of the list otherwise,
6132			 * >   - and has the tag name subject.
6133			 */
6134			$formatting_element = null;
6135			foreach ( $this->state->active_formatting_elements->walk_up() as $item ) {
6136				if ( 'marker' === $item->node_name ) {
6137					break;
6138				}
6139
6140				if ( $subject === $item->node_name ) {
6141					$formatting_element = $item;
6142					break;
6143				}
6144			}
6145
6146			// > If there is no such element, then return and instead act as described in the "any other end tag" entry above.
			if ( null === $formatting_element ) {
				$this->bail( 'Cannot run adoption agency when "any other end tag" is required.' );
			}

			// > If formatting element is not in the stack of open elements, then this is a parse error; remove the element from the list, and return.
			if ( ! $this->state->stack_of_open_elements->contains_node( $formatting_element ) ) {
				$this->state->active_formatting_elements->remove_node( $formatting_element );
				return;
			}

			// > If formatting element is in the stack of open elements, but the element is not in scope, then this is a parse error; return.
			if ( ! $this->state->stack_of_open_elements->has_element_in_scope( $formatting_element->node_name ) ) {
				return;
			}

			/*
			 * > Let furthest block be the topmost node in the stack of open elements that is lower in the stack
			 * > than formatting element, and is an element in the special category. There might not be one.
			 */
			$is_above_formatting_element = true;
			$furthest_block              = null;
			foreach ( $this->state->stack_of_open_elements->walk_down() as $item ) {
				if ( $is_above_formatting_element && $formatting_element->bookmark_name !== $item->bookmark_name ) {
					continue;
				}

				if ( $is_above_formatting_element ) {
					$is_above_formatting_element = false;
					continue;
				}

				if ( self::is_special( $item ) ) {
					$furthest_block = $item;
					break;
				}
			}

			/*
			 * > If there is no furthest block, then the UA must first pop all the nodes from the bottom of the
			 * > stack of open elements, from the current node up to and including formatting element, then
			 * > remove formatting element from the list of active formatting elements, and finally return.
			 */
			if ( null === $furthest_block ) {
				foreach ( $this->state->stack_of_open_elements->walk_up() as $item ) {
					$this->state->stack_of_open_elements->pop();

					if ( $formatting_element->bookmark_name === $item->bookmark_name ) {
						$this->state->active_formatting_elements->remove_node( $formatting_element );
						return;
					}
				}
			}

			$this->bail( 'Cannot extract common ancestor in adoption agency algorithm.' );
		}

		$this->bail( 'Cannot run adoption agency when looping required.' );
	}

	/**
	 * Runs the "close the cell" algorithm.
	 *
	 * > Where the steps above say to close the cell, they mean to run the following algorithm:
	 * >   1. Generate implied end tags.
	 * >   2. If the current node is not now a td element or a th element, then this is a parse error.
	 * >   3. Pop elements from the stack of open elements stack until a td element or a th element has been popped from the stack.
	 * >   4. Clear the list of active formatting elements up to the last marker.
	 * >   5. Switch the insertion mode to "in row".
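	 *
	 * Example (an illustrative sketch, not taken from the specification): given the input
	 *
	 *     <table><tr><td><p>one<td>two
	 *
	 * the second TD start tag arrives while the stack of open elements ends in
	 * TABLE, TBODY, TR, TD, P. Closing the cell generates an implied end tag for
	 * the P, pops the TD, clears the list of active formatting elements up to the
	 * last marker, and switches the insertion mode to "in row".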
	 *
	 * @see https://html.spec.whatwg.org/multipage/parsing.html#close-the-cell
	 *
	 * @since 6.7.0
	 */
	private function close_cell(): void {
		$this->generate_implied_end_tags();
		// @todo Parse error if the current node is not a "td" or "th" element.
		foreach ( $this->state->stack_of_open_elements->walk_up() as $element ) {
			$this->state->stack_of_open_elements->pop();
			if ( 'TD' === $element->node_name || 'TH' === $element->node_name ) {
				break;
			}
		}
		$this->state->active_formatting_elements->clear_up_to_last_marker();
		$this->state->insertion_mode = WP_HTML_Processor_State::INSERTION_MODE_IN_ROW;
	}

	/**
	 * Inserts an HTML element on the stack of open elements.
	 *
	 * @since 6.4.0
	 *
	 * @see https://html.spec.whatwg.org/#insert-an-html-element
	 *
	 * @param WP_HTML_Token $token Token representing the element to insert.
	 */
	private function insert_html_element( WP_HTML_Token $token ): void {
		$this->state->stack_of_open_elements->push( $token );
	}

	/**
	 * Inserts a foreign element on to the stack of open elements.
	 *
	 * @since 6.7.0
	 *
	 * @see https://html.spec.whatwg.org/#insert-a-foreign-element
	 *
	 * @param WP_HTML_Token $token                     Insert this token. The token's namespace and
	 *                                                 insertion point will be updated correctly.
	 * @param bool          $only_add_to_element_stack Whether to skip the "insert an element at the adjusted
	 *                                                 insertion location" algorithm when adding this element.
	 */
	private function insert_foreign_element( WP_HTML_Token $token, bool $only_add_to_element_stack ): void {
		$adjusted_current_node = $this->get_adjusted_current_node();

		$token->namespace = $adjusted_current_node ? $adjusted_current_node->namespace : 'html';

		if ( $this->is_mathml_integration_point() ) {
			$token->integration_node_type = 'math';
		} elseif ( $this->is_html_integration_point() ) {
			$token->integration_node_type = 'html';
		}

		if ( false === $only_add_to_element_stack ) {
			/*
			 * @todo Implement the "appropriate place for inserting a node" and the
			 *       "insert an element at the adjusted insertion location" algorithms.
			 *
			 * These algorithms mostly impact DOM tree construction and not the HTML API.
			 * Here, there's no DOM node onto which the element will be appended, so the
			 * parser will skip this step.
			 *
			 * @see https://html.spec.whatwg.org/#insert-an-element-at-the-adjusted-insertion-location
			 */
		}

		$this->insert_html_element( $token );
	}

	/**
	 * Inserts a virtual element on the stack of open elements.
	 *
	 * @since 6.7.0
	 *
	 * @param string      $token_name    Name of token to create and insert into the stack of open elements.
	 * @param string|null $bookmark_name Optional. Name to give bookmark for created virtual node.
	 *                                   Defaults to auto-creating a bookmark name.
	 * @return WP_HTML_Token Newly-created virtual token.
	 */
	private function insert_virtual_node( $token_name, $bookmark_name = null ): WP_HTML_Token {
		$here = $this->bookmarks[ $this->state->current_token->bookmark_name ];
		$name = $bookmark_name ?? $this->bookmark_token();

		$this->bookmarks[ $name ] = new WP_HTML_Span( $here->start, 0 );

		$token = new WP_HTML_Token( $name, $token_name, false );
		$this->insert_html_element( $token );
		return $token;
	}

	/*
	 * HTML Specification Helpers
	 */

	/**
	 * Indicates if the current token is a MathML integration point.
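	 *
	 * Example (illustrative, following the checks in this method): in the markup
	 * below, the MI and MTEXT elements are MathML text integration points, while
	 * MATH and MROW are not.
	 *
	 *     <math><mrow><mi>x</mi><mtext>text</mtext></mrow></math>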
	 *
	 * @since 6.7.0
	 *
	 * @see https://html.spec.whatwg.org/#mathml-text-integration-point
	 *
	 * @return bool Whether the current token is a MathML integration point.
	 */
	private function is_mathml_integration_point(): bool {
		$current_token = $this->state->current_token;
		if ( ! isset( $current_token ) ) {
			return false;
		}

		if ( 'math' !== $current_token->namespace || 'M' !== $current_token->node_name[0] ) {
			return false;
		}

		$tag_name = $current_token->node_name;

		return (
			'MI' === $tag_name ||
			'MO' === $tag_name ||
			'MN' === $tag_name ||
			'MS' === $tag_name ||
			'MTEXT' === $tag_name
		);
	}

	/**
	 * Indicates if the current token is an HTML integration point.
	 *
	 * Note that this method must be an instance method with access
	 * to the current token, since it needs to examine the attributes
	 * of the currently-matched tag, if it's in the MathML namespace.
	 * Otherwise it would be required to scan the HTML and ensure that
	 * no other accounting is overlooked.
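	 *
	 * Example (illustrative, following the checks in this method; the `encoding`
	 * comparison is ASCII case-insensitive):
	 *
	 *     <svg><foreignObject>                                     HTML integration point.
	 *     <svg><desc>                                              HTML integration point.
	 *     <svg><title>                                             HTML integration point.
	 *     <math><annotation-xml encoding="text/html">              HTML integration point.
	 *     <math><annotation-xml encoding="APPLICATION/XHTML+XML">  HTML integration point.
	 *     <math><annotation-xml encoding="text/plain">             Not an integration point.
	 *     <math><annotation-xml>                                   Not an integration point.
	 *     <svg><circle>                                            Not an integration point.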
	 *
	 * @since 6.7.0
	 *
	 * @see https://html.spec.whatwg.org/#html-integration-point
	 *
	 * @return bool Whether the current token is an HTML integration point.
	 */
	private function is_html_integration_point(): bool {
		$current_token = $this->state->current_token;
		if ( ! isset( $current_token ) ) {
			return false;
		}

		if ( 'html' === $current_token->namespace ) {
			return false;
		}

		$tag_name = $current_token->node_name;

		if ( 'svg' === $current_token->namespace ) {
			return (
				'DESC' === $tag_name ||
				'FOREIGNOBJECT' === $tag_name ||
				'TITLE' === $tag_name
			);
		}

		if ( 'math' === $current_token->namespace ) {
			if ( 'ANNOTATION-XML' !== $tag_name ) {
				return false;
			}

			$encoding = $this->get_attribute( 'encoding' );

			return (
				is_string( $encoding ) &&
				(
					0 === strcasecmp( $encoding, 'application/xhtml+xml' ) ||
					0 === strcasecmp( $encoding, 'text/html' )
				)
			);
		}

		$this->bail( 'Should not have reached end of HTML Integration Point detection: check HTML API code.' );
		// This unnecessary return prevents tools from inaccurately reporting type errors.
		return false;
	}

	/**
	 * Returns whether an element of a given name is in the HTML special category.
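	 *
	 * Example (illustrative; plain strings are treated as HTML-namespace tag names
	 * and matched case-insensitively):
	 *
	 *     true  === WP_HTML_Processor::is_special( 'div' );
	 *     true  === WP_HTML_Processor::is_special( 'LI' );
	 *     false === WP_HTML_Processor::is_special( 'span' );
	 *     false === WP_HTML_Processor::is_special( 'b' );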
	 *
	 * @since 6.4.0
	 *
	 * @see https://html.spec.whatwg.org/#special
	 *
	 * @param WP_HTML_Token|string $tag_name Node to check, or only its name if in the HTML namespace.
	 * @return bool Whether the element of the given name is in the special category.
	 */
	public static function is_special( $tag_name ): bool {
		if ( is_string( $tag_name ) ) {
			$tag_name = strtoupper( $tag_name );
		} else {
			$tag_name = 'html' === $tag_name->namespace
				? strtoupper( $tag_name->node_name )
				: "{$tag_name->namespace} {$tag_name->node_name}";
		}

		return (
			'ADDRESS' === $tag_name ||
			'APPLET' === $tag_name ||
			'AREA' === $tag_name ||
			'ARTICLE' === $tag_name ||
			'ASIDE' === $tag_name ||
			'BASE' === $tag_name ||
			'BASEFONT' === $tag_name ||
			'BGSOUND' === $tag_name ||
			'BLOCKQUOTE' === $tag_name ||
			'BODY' === $tag_name ||
			'BR' === $tag_name ||
			'BUTTON' === $tag_name ||
			'CAPTION' === $tag_name ||
			'CENTER' === $tag_name ||
			'COL' === $tag_name ||
			'COLGROUP' === $tag_name ||
			'DD' === $tag_name ||
			'DETAILS' === $tag_name ||
			'DIR' === $tag_name ||
			'DIV' === $tag_name ||
			'DL' === $tag_name ||
			'DT' === $tag_name ||
			'EMBED' === $tag_name ||
			'FIELDSET' === $tag_name ||
			'FIGCAPTION' === $tag_name ||
			'FIGURE' === $tag_name ||
			'FOOTER' === $tag_name ||
			'FORM' === $tag_name ||
			'FRAME' === $tag_name ||
			'FRAMESET' === $tag_name ||
			'H1' === $tag_name ||
			'H2' === $tag_name ||
			'H3' === $tag_name ||
			'H4' === $tag_name ||
			'H5' === $tag_name ||
			'H6' === $tag_name ||
			'HEAD' === $tag_name ||
			'HEADER' === $tag_name ||
			'HGROUP' === $tag_name ||
			'HR' === $tag_name ||
			'HTML' === $tag_name ||
			'IFRAME' === $tag_name ||
			'IMG' === $tag_name ||
			'INPUT' === $tag_name ||
			'KEYGEN' === $tag_name ||
			'LI' === $tag_name ||
			'LINK' === $tag_name ||
			'LISTING' === $tag_name ||
			'MAIN' === $tag_name ||
			'MARQUEE' === $tag_name ||
			'MENU' === $tag_name ||
			'META' === $tag_name ||
			'NAV' === $tag_name ||
			'NOEMBED' === $tag_name ||
			'NOFRAMES' === $tag_name ||
			'NOSCRIPT' === $tag_name ||
			'OBJECT' === $tag_name ||
			'OL' === $tag_name ||
			'P' === $tag_name ||
			'PARAM' === $tag_name ||
			'PLAINTEXT' === $tag_name ||
			'PRE' === $tag_name ||
			'SCRIPT' === $tag_name ||
			'SEARCH' === $tag_name ||
			'SECTION' === $tag_name ||
			'SELECT' === $tag_name ||
			'SOURCE' === $tag_name ||
			'STYLE' === $tag_name ||
			'SUMMARY' === $tag_name ||
			'TABLE' === $tag_name ||
			'TBODY' === $tag_name ||
			'TD' === $tag_name ||
			'TEMPLATE' === $tag_name ||
			'TEXTAREA' === $tag_name ||
			'TFOOT' === $tag_name ||
			'TH' === $tag_name ||
			'THEAD' === $tag_name ||
			'TITLE' === $tag_name ||
			'TR' === $tag_name ||
			'TRACK' === $tag_name ||
			'UL' === $tag_name ||
			'WBR' === $tag_name ||
			'XMP' === $tag_name ||

			// MathML.
			'math MI' === $tag_name ||
			'math MO' === $tag_name ||
			'math MN' === $tag_name ||
			'math MS' === $tag_name ||
			'math MTEXT' === $tag_name ||
			'math ANNOTATION-XML' === $tag_name ||

			// SVG.
			'svg DESC' === $tag_name ||
			'svg FOREIGNOBJECT' === $tag_name ||
			'svg TITLE' === $tag_name
		);
	}

	/**
	 * Returns whether a given element is an HTML Void Element.
	 *
	 * > area, base, br, col, embed, hr, img, input, link, meta, source, track, wbr
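	 *
	 * Example (illustrative; names are matched case-insensitively, and the obsolete
	 * elements listed in this method are also reported as void):
	 *
	 *     true  === WP_HTML_Processor::is_void( 'br' );
	 *     true  === WP_HTML_Processor::is_void( 'IMG' );
	 *     true  === WP_HTML_Processor::is_void( 'param' );
	 *     false === WP_HTML_Processor::is_void( 'div' );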
	 *
	 * @since 6.4.0
	 *
	 * @see https://html.spec.whatwg.org/#void-elements
	 *
	 * @param string $tag_name Name of HTML tag to check.
	 * @return bool Whether the given tag is an HTML Void Element.
	 */
	public static function is_void( $tag_name ): bool {
		$tag_name = strtoupper( $tag_name );

		return (
			'AREA' === $tag_name ||
			'BASE' === $tag_name ||
			'BASEFONT' === $tag_name || // Obsolete but still treated as void.
			'BGSOUND' === $tag_name || // Obsolete but still treated as void.
			'BR' === $tag_name ||
			'COL' === $tag_name ||
			'EMBED' === $tag_name ||
			'FRAME' === $tag_name ||
			'HR' === $tag_name ||
			'IMG' === $tag_name ||
			'INPUT' === $tag_name ||
			'KEYGEN' === $tag_name || // Obsolete but still treated as void.
			'LINK' === $tag_name ||
			'META' === $tag_name ||
			'PARAM' === $tag_name || // Obsolete but still treated as void.
			'SOURCE' === $tag_name ||
			'TRACK' === $tag_name ||
			'WBR' === $tag_name
		);
	}

	/**
	 * Gets an encoding from a given string.
	 *
	 * This is an algorithm defined in the WHATWG Encoding specification.
	 *
	 * Example:
	 *
	 *     'UTF-8' === self::get_encoding( 'utf8' );
	 *     'UTF-8' === self::get_encoding( "  \tUTF-8 " );
	 *     null    === self::get_encoding( 'UTF-7' );
	 *     null    === self::get_encoding( 'utf8; charset=' );
	 *
	 * @see https://encoding.spec.whatwg.org/#concept-encoding-get
	 *
	 * @todo As this parser only supports UTF-8, only the UTF-8
	 *       encodings are detected. Add more as desired, but the
	 *       parser will bail on non-UTF-8 encodings.
	 *
	 * @since 6.7.0
	 *
	 * @param string $label A string which may specify a known encoding.
	 * @return string|null Known encoding if matched, otherwise null.
	 */
	protected static function get_encoding( string $label ): ?string {
		/*
		 * > Remove any leading and trailing ASCII whitespace from label.
		 */
		$label = trim( $label, " \t\f\r\n" );

		/*
		 * > If label is an ASCII case-insensitive match for any of the labels listed in the
		 * > table below, then return the corresponding encoding; otherwise return failure.
		 */
		switch ( strtolower( $label ) ) {
			case 'unicode-1-1-utf-8':
			case 'unicode11utf8':
			case 'unicode20utf8':
			case 'utf-8':
			case 'utf8':
			case 'x-unicode20utf8':
				return 'UTF-8';

			default:
				return null;
		}
	}

	/*
	 * Constants that would pollute the top of the class if they were found there.
	 */

	/**
	 * Indicates that the next HTML token should be parsed and processed.
	 *
	 * @since 6.4.0
	 *
	 * @var string
	 */
	const PROCESS_NEXT_NODE = 'process-next-node';

	/**
	 * Indicates that the current HTML token should be reprocessed in the newly-selected insertion mode.
	 *
	 * @since 6.4.0
	 *
	 * @var string
	 */
	const REPROCESS_CURRENT_NODE = 'reprocess-current-node';

	/**
	 * Indicates that the current HTML token should be processed without advancing the parser.
	 *
	 * @since 6.5.0
	 *
	 * @var string
	 */
	const PROCESS_CURRENT_NODE = 'process-current-node';

	/**
	 * Indicates that the parser encountered unsupported markup and has bailed.
	 *
	 * @since 6.4.0
	 *
	 * @var string
	 */
	const ERROR_UNSUPPORTED = 'unsupported';

	/**
	 * Indicates that the parser encountered more HTML tokens than it
	 * was able to process and has bailed.
	 *
	 * @since 6.4.0
	 *
	 * @var string
	 */
	const ERROR_EXCEEDED_MAX_BOOKMARKS = 'exceeded-max-bookmarks';

	/**
	 * Unlock code that must be passed into the constructor to create this class.
	 *
	 * This class extends the WP_HTML_Tag_Processor, which has a public class
	 * constructor. Therefore, it's not possible to have a private constructor here.
	 *
	 * This unlock code is used to ensure that anyone calling the constructor is
	 * doing so with a full understanding that it's intended to be a private API.
	 *
	 * @access private
	 */
	const CONSTRUCTOR_UNLOCK_CODE = 'Use WP_HTML_Processor::create_fragment() instead of calling the class constructor directly.';
}