shim

package
v0.0.1 Latest
Published: Mar 29, 2026 License: MIT Imports: 18 Imported by: 0

README

shim

The shim package provides a drop-in replacement for Go's encoding/xml package backed by helium's parser.

Import path: github.com/lestrrat-go/helium/shim

It exposes the same core API surface as encoding/xml, including Marshal, Unmarshal, NewEncoder, NewDecoder, Decoder.Token, Encoder.EncodeToken, and the familiar struct tags such as xml:"name,attr" with the ",chardata", ",innerxml", and ",omitempty" options.

package examples_test

import (
  "fmt"

  "github.com/lestrrat-go/helium/shim"
)

func Example_shim_marshal() {
  // shim.Marshal works like encoding/xml.Marshal: it serializes a Go
  // struct into XML bytes using struct tags.
  type Person struct {
    XMLName shim.Name `xml:"person"`
    Name    string    `xml:"name"`
    Age     int       `xml:"age"`
  }

  p := Person{Name: "Alice", Age: 30}
  data, err := shim.Marshal(p)
  if err != nil {
    fmt.Printf("error: %s\n", err)
    return
  }
  fmt.Println(string(data))
  // Output:
  // <person><name>Alice</name><age>30</age></person>
}

source: examples/shim_marshal_example_test.go
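
For the decoding direction, a minimal sketch along the same lines follows. It imports the standard library's encoding/xml so that it runs without extra dependencies. Going by the description above, swapping the import for github.com/lestrrat-go/helium/shim should be the only change needed, but that is an assumption and has not been verified here. The Book type and the decodeBook helper are hypothetical names used only for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Book is a hypothetical type that exercises the ",attr" and
// ",omitempty" tag options.
type Book struct {
	XMLName xml.Name `xml:"book"`
	ID      string   `xml:"id,attr"`
	Title   string   `xml:"title"`
	Note    string   `xml:"note,omitempty"`
}

// decodeBook parses a single <book> element into a Book value.
func decodeBook(src string) (Book, error) {
	var b Book
	err := xml.Unmarshal([]byte(src), &b)
	return b, err
}

func main() {
	b, err := decodeBook(`<book id="b1"><title>Go</title></book>`)
	if err != nil {
		fmt.Printf("error: %s\n", err)
		return
	}
	fmt.Println(b.ID, b.Title)
}
```

The id attribute lands in ID and the child element in Title; Note stays empty because the input has no <note> element.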

Notes

  • Decoder.Strict = false is not supported.
  • HTMLAutoClose is omitted and Decoder.AutoClose is a no-op.
  • Undeclared namespace prefixes are rejected.
  • Namespace declarations are emitted before regular attributes.
  • InputOffset is approximate rather than exact.
  • Empty elements in ,innerxml may serialize as self-closed tags.
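
The last note is easiest to see by comparison. The sketch below uses the standard library's encoding/xml (not the shim) to show the baseline: the raw inner markup is kept verbatim, including the <T1></T1> form. Based on the note above, the shim may yield <T1/> for the same input instead. The Doc type and the innerOf helper are hypothetical names.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Doc is a hypothetical type that captures the raw markup nested
// inside the root element.
type Doc struct {
	Inner string `xml:",innerxml"`
}

// innerOf returns the inner XML of the root element of src,
// as recorded by encoding/xml.
func innerOf(src string) (string, error) {
	var d Doc
	if err := xml.Unmarshal([]byte(src), &d); err != nil {
		return "", err
	}
	return d.Inner, nil
}

func main() {
	inner, err := innerOf(`<doc><T1></T1></doc>`)
	if err != nil {
		fmt.Printf("error: %s\n", err)
		return
	}
	// With encoding/xml this prints the element exactly as written.
	fmt.Println(inner)
}
```

Code that compares ",innerxml" output byte for byte is the kind most likely to notice this difference after switching packages.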

Documentation

Overview

Package shim provides a drop-in replacement for encoding/xml backed by the helium XML parser.

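The API mirrors encoding/xml so that switching between the two mostly comes down to changing the import path. One exception is visible in the signatures documented below: NewDecoder and NewTokenDecoder take a context.Context as their first argument, and Decoder has a Close method that cancels the SAX goroutine. The underlying parser is helium's SAX-based parser, which provides stricter XML compliance and better performance for large documents.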

Known Differences from encoding/xml

The following behaviors differ from encoding/xml and are not expected to change:

  • InnerXML serialization of empty elements: when unmarshaling a field tagged with ",innerxml", empty elements such as <T1></T1> are serialized as self-closed <T1/>. The helium DOM does not preserve the original serialization form of empty elements.
  • Non-strict mode (Decoder.Strict = false) is not supported. The shim always parses in strict XML mode.
  • The HTMLAutoClose variable and the Decoder.AutoClose field are not supported. HTMLAutoClose is omitted entirely. AutoClose is present for signature compatibility but is a no-op.
  • The deprecated encoding/xml.Escape function is omitted. Use EscapeText instead.
  • Namespace strictness: undeclared namespace prefixes are rejected. encoding/xml silently accepts undeclared prefixes and places the raw prefix string in Name.Space.
  • Attribute ordering: xmlns namespace declarations are emitted before regular attributes. Source-document attribute order is not preserved because the SAX parser delivers namespaces and attributes as separate slices.
  • Decoder.InputOffset returns an approximate byte offset estimated from the serialized size of each token, not an exact count of bytes consumed from the input. It may diverge from encoding/xml for namespace-prefixed names, entity references, CDATA sections, and self-closing elements.
  • Decoder.InputPos is based on a SAX locator snapshot taken at event time. Column numbers may differ from encoding/xml. During prolog token emission the reported position is (1, 1).
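
To make the namespace-strictness item concrete, the following sketch shows the encoding/xml side of the comparison: an element with an undeclared prefix is accepted and the raw prefix ends up in Name.Space. According to the list above, the shim rejects the same input with an error. The code uses only the standard library, and firstElementName is a hypothetical helper.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// firstElementName returns the name of the first start element in src,
// as reported by encoding/xml's Decoder.Token.
func firstElementName(src string) (xml.Name, error) {
	dec := xml.NewDecoder(strings.NewReader(src))
	for {
		tok, err := dec.Token()
		if err != nil {
			return xml.Name{}, err
		}
		if se, ok := tok.(xml.StartElement); ok {
			return se.Name, nil
		}
	}
}

func main() {
	// The prefix "a" is never declared with xmlns:a.
	name, err := firstElementName(`<a:root></a:root>`)
	if err != nil {
		fmt.Printf("error: %s\n", err)
		return
	}
	fmt.Printf("space=%q local=%q\n", name.Space, name.Local)
}
```

Code that relies on this lenient behavior, for example by matching on Name.Space == "a", would need the input to declare the prefix before it can be parsed by the shim.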

Examples

Example code for this package lives in the examples/ directory at the repository root (files prefixed with shim_). Because the examples are in a separate test module, they do not appear in the generated documentation.

Constants

const Header = stdxml.Header

Variables

var HTMLEntity = stdxml.HTMLEntity

Functions

func EscapeText

func EscapeText(w io.Writer, s []byte) error
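
Since the deprecated Escape function is omitted, callers migrating from it need the error-returning form. The sketch below shows that call shape with encoding/xml's EscapeText, which has the same signature as the one listed here. Whether the shim's output matches byte for byte is assumed rather than verified, and escape is a hypothetical wrapper.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// escape returns s with XML-special characters replaced by entities.
func escape(s string) (string, error) {
	var buf bytes.Buffer
	// Unlike the deprecated Escape, EscapeText reports write errors.
	if err := xml.EscapeText(&buf, []byte(s)); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := escape("a < b & c")
	if err != nil {
		fmt.Printf("error: %s\n", err)
		return
	}
	fmt.Println(out)
	// Output:
	// a &lt; b &amp; c
}
```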

func Marshal

func Marshal(v any) ([]byte, error)

func MarshalIndent

func MarshalIndent(v any, prefix, indent string) ([]byte, error)
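
As an illustration of the prefix and indent parameters, here is a small sketch using encoding/xml's MarshalIndent, which shares this signature. It is assumed, not verified, that the shim produces the same layout. The Person type mirrors the README example, and indentPerson is a hypothetical helper.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Person mirrors the struct used in the README example.
type Person struct {
	XMLName xml.Name `xml:"person"`
	Name    string   `xml:"name"`
	Age     int      `xml:"age"`
}

// indentPerson marshals a Person with no line prefix and a
// two-space indent per nesting level.
func indentPerson(name string, age int) (string, error) {
	data, err := xml.MarshalIndent(Person{Name: name, Age: age}, "", "  ")
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	out, err := indentPerson("Alice", 30)
	if err != nil {
		fmt.Printf("error: %s\n", err)
		return
	}
	fmt.Println(out)
	// Output:
	// <person>
	//   <name>Alice</name>
	//   <age>30</age>
	// </person>
}
```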

func Unmarshal

func Unmarshal(data []byte, v any) error

Types

type Attr

type Attr = stdxml.Attr

type CharData

type CharData = stdxml.CharData

type Comment

type Comment = stdxml.Comment

type Decoder

type Decoder struct {
	// Strict mode. When true (default), the parser requires strict XML conformance.
	Strict bool

	// AutoClose lists element names that should be auto-closed.
	AutoClose []string

	// Entity maps entity names to replacement text.
	Entity map[string]string

	// CharsetReader, if non-nil, defines a function to generate charset-conversion
	// readers, converting from the provided charset into UTF-8.
	CharsetReader func(charset string, input io.Reader) (io.Reader, error)

	// DefaultSpace sets the default namespace for elements without an explicit namespace.
	DefaultSpace string
	// contains filtered or unexported fields
}

Decoder reads XML tokens from a stream. It is a drop-in replacement for encoding/xml.Decoder backed by helium's SAX parser.

func NewDecoder

func NewDecoder(ctx context.Context, r io.Reader) *Decoder

func NewTokenDecoder

func NewTokenDecoder(ctx context.Context, t TokenReader) *Decoder

func (*Decoder) Close

func (d *Decoder) Close()

Close cancels the SAX goroutine and releases resources.

func (*Decoder) Decode

func (d *Decoder) Decode(v any) error

func (*Decoder) DecodeElement

func (d *Decoder) DecodeElement(v any, start *StartElement) error
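
A common use of DecodeElement is to walk a large document token by token and decode only the elements of interest. The sketch below shows that pattern with encoding/xml so that it runs on its own. With the shim, the constructor call would additionally take a context.Context as its first argument, per the NewDecoder signature above; the rest is assumed to carry over unchanged. Item and decodeItems are hypothetical names.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// Item is a hypothetical record type decoded from each <item> element.
type Item struct {
	Name string `xml:"name,attr"`
}

// decodeItems scans src and decodes every <item> element it finds,
// without building the whole document in memory first.
func decodeItems(src string) ([]Item, error) {
	dec := xml.NewDecoder(strings.NewReader(src))
	var items []Item
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return items, nil
		}
		if err != nil {
			return nil, err
		}
		se, ok := tok.(xml.StartElement)
		if !ok || se.Name.Local != "item" {
			continue
		}
		var it Item
		// DecodeElement consumes input up to the matching end element.
		if err := dec.DecodeElement(&it, &se); err != nil {
			return nil, err
		}
		items = append(items, it)
	}
}

func main() {
	items, err := decodeItems(`<items><item name="a"/><item name="b"/></items>`)
	if err != nil {
		fmt.Printf("error: %s\n", err)
		return
	}
	fmt.Println(len(items), items[0].Name, items[1].Name)
}
```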

func (*Decoder) InputOffset

func (d *Decoder) InputOffset() int64

func (*Decoder) InputPos

func (d *Decoder) InputPos() (line, column int)

func (*Decoder) RawToken

func (d *Decoder) RawToken() (Token, error)

RawToken returns the next XML token without namespace resolution. Element names use prefix:local form instead of resolved namespace URIs.

func (*Decoder) Skip

func (d *Decoder) Skip() error

func (*Decoder) Token

func (d *Decoder) Token() (Token, error)

Token returns the next XML token in the input stream. Namespace URIs are resolved in the Name.Space field.
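
The difference between Token and RawToken can be seen by reading the same input both ways. The sketch below does this with encoding/xml, where RawToken leaves the prefix in Name.Space and Token replaces it with the namespace URI. How the shim splits the prefix and local part in RawToken results is not spelled out here, so treat the raw half of this example as the standard library's behavior only. elementNames is a hypothetical helper.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// elementNames collects the names of all start elements in src, using
// RawToken when raw is true and Token otherwise.
func elementNames(src string, raw bool) ([]xml.Name, error) {
	dec := xml.NewDecoder(strings.NewReader(src))
	var names []xml.Name
	for {
		var tok xml.Token
		var err error
		if raw {
			tok, err = dec.RawToken()
		} else {
			tok, err = dec.Token()
		}
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		if se, ok := tok.(xml.StartElement); ok {
			names = append(names, se.Name)
		}
	}
}

func main() {
	const src = `<p:a xmlns:p="urn:x"></p:a>`
	resolved, err := elementNames(src, false)
	if err != nil {
		fmt.Printf("error: %s\n", err)
		return
	}
	unresolved, err := elementNames(src, true)
	if err != nil {
		fmt.Printf("error: %s\n", err)
		return
	}
	fmt.Printf("Token: %q, RawToken: %q\n", resolved[0].Space, unresolved[0].Space)
}
```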

type Directive

type Directive = stdxml.Directive

type Encoder

type Encoder struct {
	// contains filtered or unexported fields
}

Encoder writes XML tokens to an output stream.

func NewEncoder

func NewEncoder(w io.Writer) *Encoder

NewEncoder returns a new encoder that writes to w.

func (*Encoder) Close

func (enc *Encoder) Close() error

Close flushes the encoder and returns an error if there are unclosed tags.

func (*Encoder) Encode

func (enc *Encoder) Encode(v any) error

Encode writes the XML encoding of v to the stream.

func (*Encoder) EncodeElement

func (enc *Encoder) EncodeElement(v any, start StartElement) error

EncodeElement writes the XML encoding of v to the stream, using start as the element tag.

func (*Encoder) EncodeToken

func (enc *Encoder) EncodeToken(t Token) error

EncodeToken writes the given XML token to the stream.

func (*Encoder) Flush

func (enc *Encoder) Flush() error

Flush flushes any buffered XML to the underlying writer.
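
The token-level methods fit together as follows: write a start element, its character data, and the matching end element with EncodeToken, then call Flush so the buffered output reaches the writer. This sketch uses encoding/xml's Encoder, whose method set matches the one documented here; identical output from the shim is an assumption. writeGreeting is a hypothetical helper.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// writeGreeting emits <greeting>text</greeting> one token at a time.
func writeGreeting(text string) (string, error) {
	var buf bytes.Buffer
	enc := xml.NewEncoder(&buf)

	start := xml.StartElement{Name: xml.Name{Local: "greeting"}}
	tokens := []xml.Token{start, xml.CharData(text), start.End()}
	for _, t := range tokens {
		if err := enc.EncodeToken(t); err != nil {
			return "", err
		}
	}
	// EncodeToken buffers its output, so flush before reading buf.
	if err := enc.Flush(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := writeGreeting("hi")
	if err != nil {
		fmt.Printf("error: %s\n", err)
		return
	}
	fmt.Println(out)
	// Output:
	// <greeting>hi</greeting>
}
```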

func (*Encoder) Indent

func (enc *Encoder) Indent(prefix, indent string)

Indent sets the encoder to generate XML in which each element begins on a new indented line that starts with prefix and is followed by one or more copies of indent according to the nesting depth.

type EndElement

type EndElement = stdxml.EndElement

type Marshaler

type Marshaler = stdxml.Marshaler

type MarshalerAttr

type MarshalerAttr = stdxml.MarshalerAttr

type Name

type Name = stdxml.Name

type ProcInst

type ProcInst = stdxml.ProcInst

type StartElement

type StartElement = stdxml.StartElement

type SyntaxError

type SyntaxError = stdxml.SyntaxError

type TagPathError

type TagPathError = stdxml.TagPathError

type Token

type Token = stdxml.Token

func CopyToken

func CopyToken(t Token) Token

type TokenReader

type TokenReader = stdxml.TokenReader

type UnmarshalError

type UnmarshalError = stdxml.UnmarshalError

type Unmarshaler

type Unmarshaler = stdxml.Unmarshaler

type UnmarshalerAttr

type UnmarshalerAttr = stdxml.UnmarshalerAttr

type UnsupportedTypeError

type UnsupportedTypeError = stdxml.UnsupportedTypeError
