Mike Dalessio - Rails::HTML5: the strange and remarkable three-year journey - Rails World 2023

Join Mike Dalessio as he shares the story of his three-year journey developing Rails::HTML5, exploring the complexities of HTML5 parsing and the challenges of building a reliable and secure HTML sanitizer.

Key takeaways
  • The HTML5 parser is complex and has many modes, making it hard to work with.
  • The Rails HTML sanitizer is designed to be a Swiss Army knife, handling various parsing contexts.
  • The parser should handle parsing correctly, and then sanitize the result.
  • The Rails HTML sanitizer uses a stack-based approach to parsing, making it efficient and flexible.
  • The older HTML4 spec gives no guidance on error correction, making it hard to implement a parser that behaves the way browsers do.
  • The parser should handle malformed HTML and provide a way to correct the markup.
  • The spec is a living standard, and browsers are evolving to implement it.
  • The Rails HTML sanitizer relies on Nokogiri, which wraps libxml2, for HTML5 parsing.
  • The parser should handle SVG and math tags correctly.
  • The spec is ambiguous, and browsers may implement it differently.
  • The Rails HTML sanitizer has to defend against mutation XSS (mXSS), a class of vulnerabilities in which markup changes meaning when it is parsed again.
  • The parser should handle character data and string constants correctly.
  • The parser should handle wildcard namespaces and provide a way to correct the markup.
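
The "parse first, then sanitize the result" takeaway can be illustrated with a minimal sketch. This is not the Rails::HTML5 implementation: it uses Python's standard-library html.parser, which is only a tokenizer and not a spec-compliant HTML5 parser, and the allowlists and the sanitize helper are hypothetical names chosen for this example. It shows only the ordering: the filter works on what the parser produced, never on the raw input string.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlists, chosen only for this illustration.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_URL_PREFIXES = ("http://", "https://", "/")
DROP_WITH_CONTENT = {"script", "style"}


class AllowlistSanitizer(HTMLParser):
    """Re-serializes only allowlisted tags and attributes from parsed tokens."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, ()):
                continue
            if name == "href" and not value.lower().startswith(SAFE_URL_PREFIXES):
                continue  # drops javascript: and other unlisted schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text arrives already decoded; re-escape it on the way out.
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    """Parse first, then emit only what the allowlists permit."""
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)
```

For example, sanitize('<p onclick="x()">Hi <b>there</b><script>alert(1)</script></p>') returns '<p>Hi <b>there</b></p>'. Because the tokenizer used here does not build a tree or apply the HTML5 error-correction rules, it can disagree with a browser on malformed input, which is exactly the gap the talk describes; production code should sanitize the output of a spec-compliant parser.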