Documentation ¶
Overview ¶
Package shim provides a drop-in replacement for encoding/xml backed by the helium XML parser.
The API mirrors encoding/xml so that, apart from the differences listed below, switching between the two requires only changing the import path. The underlying parser is helium's SAX-based parser, which provides stricter XML compliance and better performance for large documents.
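As a minimal sketch of the import-path swap described above, the following program is written against encoding/xml under the alias xml. The Book type and parseBook helper are hypothetical names used only for illustration, and the shim's own import path is not stated in this document, so it is left as a comment for the reader to fill in.

```go
package main

import (
	"fmt"

	// Assumption: replacing this path with the shim's import path is the
	// only change needed for the code below to run on the helium parser.
	xml "encoding/xml"
)

// Book is a hypothetical type used only for this illustration.
type Book struct {
	XMLName xml.Name `xml:"book"`
	ID      string   `xml:"id,attr"`
	Title   string   `xml:"title"`
}

// parseBook decodes a single <book> document into a Book.
func parseBook(data []byte) (Book, error) {
	var b Book
	err := xml.Unmarshal(data, &b)
	return b, err
}

func main() {
	b, err := parseBook([]byte(`<book id="42"><title>Go XML</title></book>`))
	if err != nil {
		fmt.Println("unmarshal failed:", err)
		return
	}
	fmt.Println(b.ID, b.Title)
}
```

Keeping the alias name xml means no call sites change when the import line is edited.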
Known Differences from encoding/xml ¶
The following behaviors differ from encoding/xml and are not expected to change:
- InnerXML serialization of empty elements: when unmarshaling a field tagged with ",innerxml", empty elements such as <T1></T1> are serialized as self-closed <T1/>. The helium DOM does not preserve the original serialization form of empty elements.
- Non-strict mode (Decoder.Strict = false) is not supported. The shim always parses in strict XML mode.
- HTMLAutoClose and Decoder.AutoClose are not supported. The HTMLAutoClose variable is omitted entirely; the AutoClose field is present for signature compatibility but is a no-op.
- The deprecated encoding/xml.Escape function is omitted. Use EscapeText instead.
- Namespace strictness: undeclared namespace prefixes are rejected. encoding/xml silently accepts undeclared prefixes and places the raw prefix string in Name.Space.
- Attribute ordering: xmlns namespace declarations are emitted before regular attributes. Source-document attribute order is not preserved because the SAX parser delivers namespaces and attributes as separate slices.
- Decoder.InputOffset returns an approximate byte offset estimated from the serialized size of each token, not an exact count of bytes consumed from the input. It may diverge from encoding/xml for namespace-prefixed names, entity references, CDATA sections, and self-closing elements.
- Decoder.InputPos is based on a SAX locator snapshot taken at event time. Column numbers may differ from encoding/xml. During prolog token emission the reported position is (1, 1).
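To make two of the differences above concrete, the sketch below records what encoding/xml itself does for an empty element captured through ",innerxml" and for an undeclared namespace prefix. It runs against the standard library only. According to the list above, the shim would instead report the self-closed form and reject the undeclared prefix, but that is not exercised here. The wrapper type and both helper functions are hypothetical names.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// wrapper captures the raw content of the root element via ",innerxml".
type wrapper struct {
	Inner string `xml:",innerxml"`
}

// innerXML returns the ",innerxml" capture for doc.
func innerXML(doc string) (string, error) {
	var w wrapper
	err := xml.Unmarshal([]byte(doc), &w)
	return w.Inner, err
}

// firstElementSpace returns Name.Space of the first start element in doc.
func firstElementSpace(doc string) (string, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	for {
		tok, err := d.Token()
		if err != nil {
			return "", err
		}
		if se, ok := tok.(xml.StartElement); ok {
			return se.Name.Space, nil
		}
	}
}

func main() {
	// encoding/xml keeps the source form of the empty element.
	inner, err := innerXML(`<root><T1></T1></root>`)
	fmt.Printf("%q %v\n", inner, err)

	// encoding/xml accepts the undeclared prefix and reports it verbatim.
	space, err := firstElementSpace(`<p:item>x</p:item>`)
	fmt.Printf("%q %v\n", space, err)
}
```

Running the same helpers after switching the import is one way to check whether existing code depends on either behavior.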
Examples ¶
Example code for this package lives in the examples/ directory at the repository root (files prefixed with shim_). Because the examples are in a separate test module, they do not appear in the generated documentation.
Index ¶
- Constants
- Variables
- func EscapeText(w io.Writer, s []byte) error
- func Marshal(v any) ([]byte, error)
- func MarshalIndent(v any, prefix, indent string) ([]byte, error)
- func Unmarshal(data []byte, v any) error
- type Attr
- type CharData
- type Comment
- type Decoder
- func (d *Decoder) Close()
- func (d *Decoder) Decode(v any) error
- func (d *Decoder) DecodeElement(v any, start *StartElement) error
- func (d *Decoder) InputOffset() int64
- func (d *Decoder) InputPos() (line, column int)
- func (d *Decoder) RawToken() (Token, error)
- func (d *Decoder) Skip() error
- func (d *Decoder) Token() (Token, error)
- type Directive
- type Encoder
- type EndElement
- type Marshaler
- type MarshalerAttr
- type Name
- type ProcInst
- type StartElement
- type SyntaxError
- type TagPathError
- type Token
- type TokenReader
- type UnmarshalError
- type Unmarshaler
- type UnmarshalerAttr
- type UnsupportedTypeError
Constants ¶
const Header = stdxml.Header
Variables ¶
var HTMLEntity = stdxml.HTMLEntity
Functions ¶
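The index lists EscapeText, Marshal, MarshalIndent, and Unmarshal as package-level functions. The sketch below shows EscapeText (the suggested replacement for the omitted Escape) and MarshalIndent combined with the Header constant. It is written against encoding/xml on the assumption that the shim behaves the same way for these calls. Point, escape, and document are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// Point is a hypothetical type used only for this illustration.
type Point struct {
	XMLName xml.Name `xml:"pt"`
	X       int      `xml:"x"`
	Y       int      `xml:"y"`
}

// escape returns s with XML-significant characters escaped.
func escape(s string) (string, error) {
	var buf bytes.Buffer
	if err := xml.EscapeText(&buf, []byte(s)); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// document marshals v with two-space indentation and prepends the XML header.
func document(v any) (string, error) {
	body, err := xml.MarshalIndent(v, "", "  ")
	if err != nil {
		return "", err
	}
	return xml.Header + string(body), nil
}

func main() {
	esc, err := escape("a < b & c")
	fmt.Println(esc, err)

	doc, err := document(Point{X: 1, Y: 2})
	fmt.Println(doc, err)
}
```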
Types ¶
type Decoder ¶
type Decoder struct {
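// Strict mode. Defaults to true. The shim always parses in strict XML mode;
// setting this field to false is not supported.
Strict bool
// AutoClose lists element names that should be auto-closed. The field is
// present for signature compatibility only and is a no-op.
AutoClose []string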
// Entity maps entity names to replacement text.
Entity map[string]string
// CharsetReader, if non-nil, defines a function to generate charset-conversion
// readers, converting from the provided charset into UTF-8.
CharsetReader func(charset string, input io.Reader) (io.Reader, error)
// DefaultSpace sets the default namespace for elements without an explicit namespace.
DefaultSpace string
// contains filtered or unexported fields
}
Decoder reads XML tokens from a stream. It is a drop-in replacement for encoding/xml.Decoder backed by helium's SAX parser.
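A typical token loop over a Decoder, including the Entity field, might look like the sketch below. It uses encoding/xml's NewDecoder, which is an assumption here: this page lists only NewTokenDecoder (which takes a context), so the shim's constructor may need an extra argument. collectText is a hypothetical helper.

```go
package main

import (
	"encoding/xml"
	"errors"
	"fmt"
	"io"
	"strings"
)

// collectText concatenates all character data in doc, expanding the
// custom entities supplied in entities.
func collectText(doc string, entities map[string]string) (string, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	d.Entity = entities // maps entity names to replacement text

	var sb strings.Builder
	for {
		tok, err := d.Token()
		if errors.Is(err, io.EOF) {
			return sb.String(), nil // end of input
		}
		if err != nil {
			return "", err
		}
		if cd, ok := tok.(xml.CharData); ok {
			sb.Write(cd)
		}
	}
}

func main() {
	text, err := collectText(`<p>by &co;</p>`, map[string]string{"co": "Acme"})
	fmt.Println(text, err)
}
```

With the shim, a deferred call to Close after constructing the decoder would release the SAX goroutine; encoding/xml has no such method, so it is omitted from this sketch.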
func NewTokenDecoder ¶
func NewTokenDecoder(ctx context.Context, t TokenReader) *Decoder
func (*Decoder) Close ¶
func (d *Decoder) Close()
Close cancels the SAX goroutine and releases resources.
func (*Decoder) DecodeElement ¶
func (d *Decoder) DecodeElement(v any, start *StartElement) error
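DecodeElement is commonly paired with Token to stream through a large document and unmarshal only selected elements. The sketch below shows that pattern against encoding/xml, assuming the shim follows the same semantics. Item and items are hypothetical names.

```go
package main

import (
	"encoding/xml"
	"errors"
	"fmt"
	"io"
	"strings"
)

// Item is a hypothetical type used only for this illustration.
type Item struct {
	SKU  string `xml:"sku,attr"`
	Name string `xml:",chardata"`
}

// items decodes every <item> element found in doc and ignores the rest.
func items(doc string) ([]Item, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	var out []Item
	for {
		tok, err := d.Token()
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		se, ok := tok.(xml.StartElement)
		if !ok || se.Name.Local != "item" {
			continue
		}
		// Decode the element whose start tag was just consumed.
		var it Item
		if err := d.DecodeElement(&it, &se); err != nil {
			return nil, err
		}
		out = append(out, it)
	}
}

func main() {
	got, err := items(`<order><item sku="a1">pen</item><note>rush</note><item sku="b2">ink</item></order>`)
	fmt.Println(got, err)
}
```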
func (*Decoder) InputOffset ¶
func (d *Decoder) InputOffset() int64
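Because the shim's InputOffset is described under Known Differences as an estimate, it can help to record the exact offsets that encoding/xml reports and compare them after switching. The sketch below collects the offset after each token using the standard library. tokenOffsets is a hypothetical helper, and the shim's values for the same input are not shown because they may differ.

```go
package main

import (
	"encoding/xml"
	"errors"
	"fmt"
	"io"
	"strings"
)

// tokenOffsets returns the value of InputOffset after each token in doc.
func tokenOffsets(doc string) ([]int64, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	var offs []int64
	for {
		_, err := d.Token()
		if errors.Is(err, io.EOF) {
			return offs, nil
		}
		if err != nil {
			return nil, err
		}
		offs = append(offs, d.InputOffset())
	}
}

func main() {
	// With encoding/xml each offset marks the end of the token just returned:
	// "<a>" ends at 3, "hi" at 5, "</a>" at 9.
	offs, err := tokenOffsets(`<a>hi</a>`)
	fmt.Println(offs, err)
}
```

Code that slices the original input by these offsets is the most likely to be affected by the approximation.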
type Encoder ¶
type Encoder struct {
// contains filtered or unexported fields
}
Encoder writes XML tokens to an output stream.
func NewEncoder ¶
NewEncoder returns a new encoder that writes to w.
func (*Encoder) EncodeElement ¶
func (enc *Encoder) EncodeElement(v any, start StartElement) error
EncodeElement writes the XML encoding of v to the stream, using start as the element tag.
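A short sketch of EncodeElement follows, overriding the element name through the start argument. It is written against encoding/xml and assumes the shim's NewEncoder takes an io.Writer in the same way, which this page does not show. Point and encodeAs are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// Point is a hypothetical type used only for this illustration.
type Point struct {
	X int `xml:"x"`
	Y int `xml:"y"`
}

// encodeAs writes v as an element called name and returns the XML text.
func encodeAs(name string, v any) (string, error) {
	var buf bytes.Buffer
	enc := xml.NewEncoder(&buf)
	start := xml.StartElement{Name: xml.Name{Local: name}}
	if err := enc.EncodeElement(v, start); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := encodeAs("pt", Point{X: 1, Y: 2})
	fmt.Println(out, err)
}
```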
func (*Encoder) EncodeToken ¶
EncodeToken writes the given XML token to the stream.
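Token-level output can be sketched as below: a start element with one attribute, character data, and the matching end element. The code targets encoding/xml. The final Flush call is encoding/xml's method and is assumed, not confirmed, to exist on the shim's Encoder, since it does not appear in this listing. greeting is a hypothetical helper.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// greeting builds <greeting lang="...">text</greeting> token by token.
func greeting(lang, text string) (string, error) {
	var buf bytes.Buffer
	enc := xml.NewEncoder(&buf)

	start := xml.StartElement{
		Name: xml.Name{Local: "greeting"},
		Attr: []xml.Attr{{Name: xml.Name{Local: "lang"}, Value: lang}},
	}
	tokens := []xml.Token{start, xml.CharData(text), start.End()}
	for _, t := range tokens {
		if err := enc.EncodeToken(t); err != nil {
			return "", err
		}
	}
	// EncodeToken buffers its output in encoding/xml, so flush before reading.
	if err := enc.Flush(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := greeting("en", "hi")
	fmt.Println(out, err)
}
```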
type EndElement ¶
type EndElement = stdxml.EndElement
type MarshalerAttr ¶
type MarshalerAttr = stdxml.MarshalerAttr
type StartElement ¶
type StartElement = stdxml.StartElement
type SyntaxError ¶
type SyntaxError = stdxml.SyntaxError
type TagPathError ¶
type TagPathError = stdxml.TagPathError
type TokenReader ¶
type TokenReader = stdxml.TokenReader
type UnmarshalError ¶
type UnmarshalError = stdxml.UnmarshalError
type Unmarshaler ¶
type Unmarshaler = stdxml.Unmarshaler
type UnmarshalerAttr ¶
type UnmarshalerAttr = stdxml.UnmarshalerAttr
type UnsupportedTypeError ¶
type UnsupportedTypeError = stdxml.UnsupportedTypeError