avro

package module
v1.3.1
Published: Mar 30, 2026 License: MIT Imports: 22 Imported by: 1

README

avro


Encode and decode Avro binary data.

Parse an Avro JSON schema, then encode and decode Go values directly — no code generation required. Supports all primitive and complex types, logical types, schema evolution, Object Container Files, Single Object Encoding, and fingerprinting.


Quick Start

package main

import (
	"fmt"
	"log"

	"github.com/twmb/avro"
)

var schema = avro.MustParse(`{
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age",  "type": "int"}
    ]
}`)

type User struct {
	Name string `avro:"name"`
	Age  int    `avro:"age"`
}

func main() {
	// Encode
	data, err := schema.Encode(&User{Name: "Alice", Age: 30})
	if err != nil {
		log.Fatal(err)
	}

	// Decode
	var u User
	_, err = schema.Decode(data, &u)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(u) // {Alice 30}
}

Parse accepts options: pass WithLaxNames() to allow non-standard characters in type and field names (useful for interop with schemas from other languages).

Type Mapping

The table below shows which Go types can be used with each Avro type.

Avro Type Encode Decode
null any (nil) any
boolean bool bool, any
int, long int, int8–int64, uint–uint64, float64, json.Number int, int8–int64, uint–uint64, any
float float32, float64, json.Number float32, float64, any
double float64, float32, json.Number float64, float32, any
string string, []byte, encoding.TextAppender, encoding.TextMarshaler string, []byte, encoding.TextUnmarshaler, any
bytes []byte, string []byte, string, any
enum string, any integer type (ordinal) string, any integer type (ordinal), any
fixed [N]byte, []byte [N]byte, []byte, any
array slice slice, any
map map[string]T map[string]T, any
union any, *T, or the matched branch type any, *T, or the matched branch type
record struct, map[string]any struct, map[string]any, any

When decoding into any, values use their natural Go types: nil, bool, int32, int64, float32, float64, string, []byte, []any, map[string]any. Logical types use time.Time (UTC) for timestamps and dates, time.Duration for time-of-day types, json.Number for decimals, and avro.Duration for the duration logical type.

Encoding also accepts json.Number for any numeric type (supporting json.Decoder.UseNumber() pipelines) and []byte for string fields (and vice versa).

Struct Tags

Struct fields are matched to Avro record fields by name. Use the avro struct tag to control the mapping:

type Example struct {
    Name    string  `avro:"name"`          // maps to Avro field "name"
    Ignored int     `avro:"-"`             // excluded from encoding/decoding
    Inner   Nested  `avro:",inline"`       // inline Nested's fields into this record
    Value   int     `avro:"val,omitzero"`  // encode zero value as Avro default
}

The tag format is:

avro:"[name][,option][,option]..."

The name portion maps the struct field to the Avro field with that name. If empty, the Go field name is used as-is. A tag of "-" excludes the field entirely.

Supported options:

  • inline: flatten a nested struct's fields into the parent record, as if they were declared directly on the parent. The field must be a struct or pointer to struct. This works like anonymous (embedded) struct fields, but for named fields. When using inline, the name portion of the tag must be empty.

  • omitzero: when encoding, if the field is the zero value for its type (or implements an IsZero() bool method that returns true), the Avro default value from the schema is used instead. This is useful for optional fields in ["null", T] unions or fields with explicit defaults.

Embedded (anonymous) struct fields are automatically inlined — their fields are promoted into the parent as if declared directly. To prevent inlining an embedded struct, give it an explicit name tag:

type Parent struct {
    Nested                    // inlined: Nested's fields are promoted
    Other  Aux `avro:"other"` // not inlined: treated as a single field
}

When multiple fields at different depths resolve to the same Avro field name, the shallowest field wins. Among fields at the same depth, a tagged field wins over an untagged one.
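The depth rule can be sketched with two hypothetical types; the codec resolves "id" the same way Go resolves the shadowed selector:

```go
package main

import "fmt"

type Inner struct {
	ID string `avro:"id"` // depth 1 once promoted via embedding
}

type Outer struct {
	Inner               // embedded: Inner's fields are inlined
	ID string `avro:"id"` // depth 0: shallowest, wins for Avro field "id"
}

func main() {
	o := Outer{ID: "outer"}
	o.Inner.ID = "inner"
	// o.ID resolves to the depth-0 field; Avro field "id" maps there too.
	fmt.Println(o.ID, o.Inner.ID) // outer inner
}
```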

Schema Inference

SchemaFor infers an Avro schema from a Go struct type, using the same struct tags as encoding/decoding:

type User struct {
    Name      string     `avro:"name"`
    Age       int32      `avro:"age,default=18"`
    Email     *string    `avro:"email"`
    CreatedAt time.Time  `avro:"created_at"`
}

schema := avro.MustSchemaFor[User](avro.WithNamespace("com.example"))

This produces the equivalent of:

{
  "type": "record",
  "name": "User",
  "namespace": "com.example",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age", "type": "int", "default": 18},
    {"name": "email", "type": ["null", "string"]},
    {"name": "created_at", "type": {"type": "long", "logicalType": "timestamp-millis"}}
  ]
}

Go types map to Avro types automatically: *T becomes a ["null", T] union, time.Time becomes timestamp-millis, and so on (see Type Mapping).

Additional tag options for schema inference:

Tag Example Description
default= avro:",default=0" Default value (must be last; scalars only)
alias= avro:",alias=old" Field alias for schema evolution (repeatable)
timestamp-micros avro:",timestamp-micros" Override logical type
decimal(p,s) avro:",decimal(10,2)" Decimal logical type (required for *big.Rat)
uuid avro:",uuid" UUID logical type
date avro:",date" Date logical type

Options:

  • WithNamespace(ns) sets the Avro namespace for the record.
  • WithName(name) overrides the record name (defaults to the Go struct name).

Schema Introspection

Schema.Root() returns a SchemaNode representing the parsed schema. This provides read access to all schema metadata including field types, logical types, doc strings, and custom properties:

schema, _ := avro.Parse(schemaJSON)
root := schema.Root()

for _, f := range root.Fields {
    fmt.Printf("field %s: type=%s\n", f.Name, f.Type.Type)
    if cn, ok := f.Props["connect.name"].(string); ok {
        fmt.Printf("  kafka connect type: %s\n", cn)
    }
}

SchemaNode can also be used to build schemas programmatically:

node := &avro.SchemaNode{
    Type: "record",
    Name: "User",
    Fields: []avro.SchemaField{
        {Name: "name", Type: avro.SchemaNode{Type: "string"}},
        {Name: "age", Type: avro.SchemaNode{Type: "int"}, Default: 18},
    },
}
schema, err := node.Schema()

Logical Types

Logical types decode to their natural Go equivalents:

Logical Type Avro Type Encode Decode
date int time.Time, RFC 3339 or YYYY-MM-DD string, or int time.Time (UTC)
time-millis int time.Duration or int time.Duration
time-micros long time.Duration or int time.Duration
timestamp-millis long time.Time, RFC 3339 string, or int time.Time (UTC)
timestamp-micros long time.Time, RFC 3339 string, or int time.Time (UTC)
timestamp-nanos long time.Time, RFC 3339 string, or int time.Time (UTC)
local-timestamp-millis long time.Time, RFC 3339 string, or int time.Time (UTC)
local-timestamp-micros long time.Time, RFC 3339 string, or int time.Time (UTC)
local-timestamp-nanos long time.Time, RFC 3339 string, or int time.Time (UTC)
uuid string or fixed(16) [16]byte or string [16]byte (typed target) or string (any target)
decimal bytes or fixed *big.Rat, float64, numeric string, json.Number, or underlying type *big.Rat, json.Number, or underlying type
duration fixed(12) avro.Duration or underlying type avro.Duration or underlying type

When encoding, timestamp and date fields accept RFC 3339 strings, and decimal fields accept float64 and numeric strings (e.g. "3.14"). Values that don't match the expected format fall through to the underlying type's encoder, which will return an error.

Unknown logical types are silently ignored per the Avro spec, and the underlying type is used as-is.

Schema Evolution

Avro data is always written with a specific schema — the writer schema. When you read that data later, your application may expect a different schema — the reader schema. You may have added a field, removed one, or widened a type from int to long.

Resolve bridges this gap. Given the writer and reader schemas, it returns a new schema that decodes data in the old wire format and produces values in the reader's layout:

  • Fields in the reader but not the writer are filled from defaults.
  • Fields in the writer but not the reader are skipped.
  • Fields that exist in both are matched by name (or alias) and decoded, with type promotion applied where needed (e.g. int → long).
Example

Suppose v1 of your application wrote User records with just a name:

var writerSchema = avro.MustParse(`{
    "type": "record", "name": "User",
    "fields": [
        {"name": "name", "type": "string"}
    ]
}`)

In v2 you added an email field with a default:

var readerSchema = avro.MustParse(`{
    "type": "record", "name": "User",
    "fields": [
        {"name": "name",  "type": "string"},
        {"name": "email", "type": "string", "default": ""}
    ]
}`)

type User struct {
    Name  string `avro:"name"`
    Email string `avro:"email"`
}

To read old v1 data with your v2 struct, resolve the two schemas:

resolved, err := avro.Resolve(writerSchema, readerSchema)

var u User
_, err = resolved.Decode(v1Data, &u)
// u == User{Name: "Alice", Email: ""}

The following type promotions are supported:

Writer → Reader
int → long, float, double
long → float, double
float → double
string ↔ bytes

CheckCompatibility checks whether two schemas are compatible without building a resolved schema. The direction you check depends on the guarantee you need:

// Backward: new schema can read old data.
avro.CheckCompatibility(oldSchema, newSchema)

// Forward: old schema can read new data.
avro.CheckCompatibility(newSchema, oldSchema)

// Full: check both directions.
avro.CheckCompatibility(oldSchema, newSchema)
avro.CheckCompatibility(newSchema, oldSchema)

Schema Cache

When working with a schema registry, schemas often reference types defined in other schemas. SchemaCache accumulates named types across multiple Parse calls so they can be resolved:

var cache avro.SchemaCache

// Parse referenced schema first — order matters.
_, err := cache.Parse(`{
    "type": "record",
    "name": "Address",
    "fields": [{"name": "city", "type": "string"}]
}`)

// Now parse a schema that references Address.
schema, err := cache.Parse(`{
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name",    "type": "string"},
        {"name": "address", "type": "Address"}
    ]
}`)

Parsing the same schema string multiple times returns the cached result, handling diamond dependencies without caller-side deduplication. The returned *Schema is independent of the cache and safe to use concurrently.

Custom Types

Register custom Go type conversions with NewCustomType for type-safe primitive conversions, or CustomType for advanced cases:

type Money struct {
    Cents    int64
    Currency string
}

moneyType := avro.NewCustomType[Money, int64]("money",
    func(m Money, _ *avro.SchemaNode) (int64, error) { return m.Cents, nil },
    func(c int64, _ *avro.SchemaNode) (Money, error) {
        return Money{Cents: c, Currency: "USD"}, nil
    },
)

schema := avro.MustParse(moneySchema, moneyType)

// Encode and decode — Money fields are automatically converted.
data, _ := schema.Encode(&order)
var out Order
schema.Decode(data, &out) // out.Price is Money{Cents: 500, ...}

// Works with SchemaFor too.
schema = avro.MustSchemaFor[Order](moneyType)

Custom types replace built-in logical type handling entirely — callbacks receive raw Avro-native values (int64 for long, int32 for int, etc.):

// Decode timestamps as raw int64 instead of time.Time.
schema := avro.MustParse(raw, avro.CustomType{
    LogicalType: "timestamp-millis",
    Decode: func(v any, _ *avro.SchemaNode) (any, error) {
        return v, nil // pass through raw int64
    },
})

For property-based dispatch (e.g., Kafka Connect / Debezium types), use an empty matching criteria with ErrSkipCustomType:

avro.CustomType{
    Decode: func(v any, node *avro.SchemaNode) (any, error) {
        name, _ := node.Props["connect.name"].(string)
        switch name {
        case "io.debezium.time.Timestamp":
            return time.UnixMilli(v.(int64)).UTC(), nil
        default:
            return nil, avro.ErrSkipCustomType
        }
    },
}

Object Container Files

The ocf sub-package reads and writes Avro Object Container Files — self-describing binary files that embed the schema in the header and store data in compressed blocks.

Writing
var schema = avro.MustParse(`{
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age",  "type": "int"}
    ]
}`)

f, _ := os.Create("users.avro")
w, err := ocf.NewWriter(f, schema, ocf.WithCodec(ocf.SnappyCodec()))
if err != nil {
    log.Fatal(err)
}
w.Encode(&User{Name: "Alice", Age: 30})
w.Encode(&User{Name: "Bob", Age: 25})
w.Close()
f.Close()
Reading
f, _ := os.Open("users.avro")
r, err := ocf.NewReader(f)
if err != nil {
    log.Fatal(err)
}
defer r.Close()
for {
    var u User
    err := r.Decode(&u)
    if err == io.EOF {
        break
    }
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(u)
}

The reader's Schema() method returns the schema parsed from the file header, which you can pass as the writer schema to Resolve.

Codecs

Built-in codecs: null (default, no compression), deflate (DeflateCodec), snappy (SnappyCodec), and zstandard (ZstdCodec). Custom codecs can be provided via the Codec interface.

Appending

NewAppendWriter opens an existing OCF for appending — it reads the header to recover the schema, codec, and sync marker, then seeks to the end.

JSON Encoding

EncodeJSON is a schema-aware JSON serializer. By default it produces standard JSON with bare union values and \uXXXX-encoded bytes:

// Standard JSON (default): bare unions
jsonBytes, err := schema.EncodeJSON(&user)
// {"name":"Alice","email":"[email protected]"}

// Avro JSON: unions wrapped as {"type_name": value}
jsonBytes, err = schema.EncodeJSON(&user, avro.TaggedUnions())
// {"name":"Alice","email":{"string":"[email protected]"}}

DecodeJSON accepts both formats (tagged and bare unions) and all NaN/Infinity conventions:

var user User
err = schema.DecodeJSON(jsonBytes, &user)

Decode and DecodeJSON also accept TaggedUnions() to wrap union values when decoding into *any:

var native any
schema.Decode(binary, &native, avro.TaggedUnions())
// native["email"] is map[string]any{"string": "[email protected]"}

Pass TagLogicalTypes() with TaggedUnions() to qualify union branch names with their logical type (e.g. "long.timestamp-millis" instead of "long"), matching the linkedin/goavro naming convention.

NaN and Infinity float values are encoded as "NaN", "Infinity", "-Infinity" strings by default (Java Avro convention). Pass LinkedinFloats() for the linkedin/goavro convention (null for NaN, ±1e999 for Infinity).

Single Object Encoding

For sending self-describing values over the wire (as opposed to files, where OCF is preferred), use Single Object Encoding. Each message is a 2-byte magic header, an 8-byte CRC-64-AVRO fingerprint, and the Avro binary payload.

// Encode with fingerprint header
data, err := schema.AppendSingleObject(nil, &user)

// Decode (schema known)
_, err = schema.DecodeSingleObject(data, &user)

// Decode (schema unknown): extract fingerprint, look up schema
fp, payload, err := avro.SingleObjectFingerprint(data)
schema := registry.Lookup(fp) // your schema registry
_, err = schema.Decode(payload, &user)
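The framing itself is fixed by the Avro specification, so it can be sketched without the library: the two magic bytes C3 01, then the fingerprint in little-endian byte order, then the payload.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// soeFrame assembles a single-object message per the Avro spec:
// magic bytes C3 01, the 8-byte schema fingerprint little-endian,
// then the Avro binary payload.
func soeFrame(fingerprint uint64, payload []byte) []byte {
	out := make([]byte, 0, 10+len(payload))
	out = append(out, 0xC3, 0x01)
	out = binary.LittleEndian.AppendUint64(out, fingerprint)
	return append(out, payload...)
}

func main() {
	msg := soeFrame(0x0102030405060708, []byte{0x0A})
	fmt.Printf("% X\n", msg) // C3 01 08 07 06 05 04 03 02 01 0A
}
```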

Fingerprinting

Canonical returns the Parsing Canonical Form of a schema — a deterministic JSON representation stripped of doc, aliases, defaults, and other non-essential attributes. Use it for schema comparison and fingerprinting.

canonical := schema.Canonical() // []byte

// CRC-64-AVRO (Rabin) — the Avro-standard fingerprint
fp := schema.Fingerprint(avro.NewRabin())

// SHA-256 — common for cross-language registries
fp256 := schema.Fingerprint(sha256.New())
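NewRabin implements the fingerprint for you, but the algorithm from the Avro spec is small enough to sketch in plain Go for reference (EMPTY = 0xc15d213aa4d7a795 per the spec):

```go
package main

import "fmt"

// rabinEmpty is the spec's EMPTY constant for CRC-64-AVRO.
const rabinEmpty = uint64(0xc15d213aa4d7a795)

// rabinTable is the byte-indexed lookup table from the spec's
// fingerprinting pseudo-code.
var rabinTable = func() [256]uint64 {
	var t [256]uint64
	for i := range t {
		fp := uint64(i)
		for j := 0; j < 8; j++ {
			fp = (fp >> 1) ^ (rabinEmpty & -(fp & 1))
		}
		t[i] = fp
	}
	return t
}()

// rabin fingerprints buf, e.g. a schema's Parsing Canonical Form.
func rabin(buf []byte) uint64 {
	fp := rabinEmpty
	for _, b := range buf {
		fp = (fp >> 8) ^ rabinTable[byte(fp)^b]
	}
	return fp
}

func main() {
	fmt.Printf("%016x\n", rabin([]byte(`"int"`)))
}
```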

Errors

Encode and decode errors can be inspected with errors.As:

  • *SemanticError: type mismatch between Go and Avro (includes a dotted field path for nested records, e.g. "address.zip").
  • *ShortBufferError: input truncated mid-value.
  • *CompatibilityError: schema evolution incompatibility (from Resolve or CheckCompatibility).

Performance

Struct field access uses unsafe pointer arithmetic (similar to encoding/json v2) to avoid reflect.Value overhead on every encode/decode. All schemas, type mappings, and codec state are cached after first use so repeated operations pay no extra allocation cost.

Documentation

Overview

Package avro encodes and decodes Avro specification data.

Parse an Avro JSON schema with Parse (or MustParse for package-level vars), then call Schema.Encode / Schema.Decode for binary encoding, or Schema.EncodeJSON / Schema.DecodeJSON for JSON encoding. Use SchemaFor to infer a schema from a Go struct type, or Schema.Root to inspect a parsed schema's structure.

Basic usage

schema := avro.MustParse(`{
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age",  "type": "int"}
    ]
}`)

type User struct {
    Name string `avro:"name"`
    Age  int    `avro:"age"`
}

// Encode
data, err := schema.Encode(&User{Name: "Alice", Age: 30})

// Decode
var u User
_, err = schema.Decode(data, &u)

JSON encoding

Schema.EncodeJSON is schema-aware and handles bytes, unions, and NaN/Infinity floats correctly — use it instead of encoding/json.Marshal when serializing decoded Avro data to JSON. Options control the output format: TaggedUnions for Avro JSON union wrappers ({"type": value}), TagLogicalTypes for qualified branch names, and LinkedinFloats for the goavro NaN/Infinity convention.

Encoding from JSON input

Data from encoding/json.Unmarshal (map[string]any with float64 numbers and string timestamps) can be encoded directly. Missing map keys are filled from schema defaults, encoding/json.Number is accepted for all numeric types, and timestamp fields accept RFC 3339 strings. String fields accept encoding.TextAppender and encoding.TextMarshaler implementations (with encoding.TextUnmarshaler on decode).

Schema evolution

Avro data is always written with a specific schema — the "writer schema." When you read that data later, your application may expect a different schema — the "reader schema." For example, you may have added a field, removed one, or widened a type from int to long. The data on disk doesn't change, but your code expects the new layout.

Resolve bridges this gap. Given the writer and reader schemas, it returns a new schema that knows how to decode the old wire format and produce values in the reader's layout:

  • Fields in the reader but not the writer are filled from defaults.
  • Fields in the writer but not the reader are skipped.
  • Fields that exist in both are matched by name (or alias) and decoded, with type promotion applied where needed (e.g. int → long).

You typically get the writer schema from the data itself: an OCF file header embeds it, and schema registries store it by ID or fingerprint.

As a concrete example, suppose v1 of your application wrote User records with just a name:

var writerSchema = avro.MustParse(`{
    "type": "record", "name": "User",
    "fields": [
        {"name": "name", "type": "string"}
    ]
}`)

In v2, you added an email field with a default:

var readerSchema = avro.MustParse(`{
    "type": "record", "name": "User",
    "fields": [
        {"name": "name",  "type": "string"},
        {"name": "email", "type": "string", "default": ""}
    ]
}`)

type User struct {
    Name  string `avro:"name"`
    Email string `avro:"email"`
}

To read old v1 data with your v2 struct, resolve the two schemas:

resolved, err := avro.Resolve(writerSchema, readerSchema)

// Decode v1 data: "email" is absent in the old data, so it gets
// the reader default ("").
var u User
_, err = resolved.Decode(v1Data, &u)
// u == User{Name: "Alice", Email: ""}

If you just want to check whether two schemas are compatible without building a resolved schema, use CheckCompatibility.

Struct tags

Use the "avro" struct tag to control field mapping and schema inference. The format is avro:"[name][,option]..." where the name maps the Go field to the Avro field name (empty = use Go field name, "-" = exclude).

Encoding/decoding options:

avro:"name"           // map to Avro field "name"
avro:"-"              // exclude field
avro:",inline"        // flatten nested struct fields into parent record
avro:",omitzero"      // encode zero values as the schema default

Schema inference options (used by SchemaFor):

avro:",default=value"         // set field default (must be last option; scalars only)
avro:",alias=old_name"        // field alias for evolution (repeatable)
avro:",timestamp-micros"      // override logical type (also: timestamp-nanos, date, time-millis, time-micros)
avro:",decimal(10,2)"         // decimal logical type with precision and scale
avro:",uuid"                  // UUID logical type

When encoding a map[string]any as a record, missing keys are filled from the schema's default values. For structs, omitzero does the same for zero-valued fields (or fields whose IsZero() method returns true).

Embedded (anonymous) struct fields are automatically inlined. To prevent inlining, give the field an explicit name tag. When multiple fields at different depths resolve to the same name, the shallowest wins; among fields at the same depth, a tagged field wins over an untagged one.

Custom types

CustomType registers custom Go type conversions for logical types, domain types, or to replace built-in behavior. A matching custom type replaces the built-in logical type handler entirely — callbacks receive raw Avro-native values, not enriched types. Use NewCustomType for type-safe primitive conversions, or the CustomType struct directly for complex cases (records, fixed types, property-based dispatch). Custom types are registered per-schema via SchemaOpt.

Parsing options

Parse and SchemaCache.Parse accept WithLaxNames to allow non-standard characters in type and field names.

Errors

Encode and decode errors can be inspected with errors.As; see *SemanticError, *ShortBufferError, and *CompatibilityError.

Other features

Example
package main

import (
	"fmt"
	"log"

	"github.com/twmb/avro"
)

func main() {
	schema := avro.MustParse(`{
		"type": "record",
		"name": "User",
		"fields": [
			{"name": "name", "type": "string"},
			{"name": "age",  "type": "int"}
		]
	}`)

	type User struct {
		Name string `avro:"name"`
		Age  int32  `avro:"age"`
	}

	data, err := schema.Encode(&User{Name: "Alice", Age: 30})
	if err != nil {
		log.Fatal(err)
	}

	var u User
	if _, err := schema.Decode(data, &u); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s is %d\n", u.Name, u.Age)
}
Output:
Alice is 30

Index

Examples

Constants

This section is empty.

Variables

var ErrSkipCustomType = errors.New("avro: skip custom type")

ErrSkipCustomType is returned from a CustomType Encode or Decode function to indicate the value is not handled by this custom type. The library falls through to the next matching custom type or to built-in behavior.

Functions

func CheckCompatibility

func CheckCompatibility(writer, reader *Schema) error

CheckCompatibility reports whether data written with the writer schema can be read by the reader schema. It returns nil on success or a *CompatibilityError describing the first incompatibility.

See Resolve for a note on argument order.

func NewRabin

func NewRabin() hash.Hash64

NewRabin returns a hash.Hash64 computing the CRC-64-AVRO (Rabin) fingerprint defined by the Avro specification.

func SingleObjectFingerprint

func SingleObjectFingerprint(data []byte) (fp [8]byte, rest []byte, err error)

SingleObjectFingerprint extracts the 8-byte CRC-64-AVRO fingerprint and returns the remaining payload.

Types

type CompatibilityError

type CompatibilityError struct {
	// Path is the dotted path to the incompatible element (e.g. "User.address.zip").
	Path string
	// ReaderType is the Avro type in the reader schema.
	ReaderType string
	// WriterType is the Avro type in the writer schema.
	WriterType string
	// Detail describes the specific incompatibility.
	Detail string
}

CompatibilityError describes an incompatibility between a reader and writer schema, as returned by CheckCompatibility and Resolve.

func (*CompatibilityError) Error

func (e *CompatibilityError) Error() string

type CustomType added in v1.3.0

type CustomType struct {
	// LogicalType narrows matching to schema nodes with this logicalType.
	LogicalType string

	// AvroType narrows matching to schema nodes of this Avro type
	// (e.g. "long", "bytes", "record"). Also used by SchemaFor to
	// infer the underlying Avro type.
	AvroType string

	// GoType adds an encode-time filter: when set, the Encode function
	// only fires when the value's concrete type matches GoType. Values
	// of other types pass through to the underlying serializer unchanged.
	// If nil, Encode fires for all values on matched schema nodes
	// (those matching LogicalType/AvroType).
	//
	// [SchemaFor] uses GoType to match struct fields: when a field's Go
	// type equals GoType, SchemaFor emits AvroType + LogicalType (or
	// Schema) instead of the default type mapping. If nil, the custom
	// type does not affect schema generation, but is still wired into
	// the returned [*Schema] for encode/decode.
	GoType reflect.Type

	// Schema is the full schema to emit in SchemaFor. Only needed for
	// types requiring extra metadata (fixed needs name+size, decimal
	// needs precision+scale, records need fields). If nil, SchemaFor
	// infers from AvroType + LogicalType.
	Schema *SchemaNode

	// Encode converts a custom Go value to an Avro-native value,
	// called before serialization. Return [ErrSkipCustomType] to fall
	// through to the next matching custom type or built-in behavior.
	// Any other non-nil error is fatal. If nil, default encoding is used.
	Encode func(v any, schema *SchemaNode) (any, error)

	// Decode converts an Avro-native value to a custom Go value,
	// called after deserialization. Return [ErrSkipCustomType] to fall
	// through. Any other non-nil error is fatal. If nil, default
	// decoding is used.
	Decode func(v any, schema *SchemaNode) (any, error)
	// contains filtered or unexported fields
}

CustomType defines a custom conversion between a Go type and an Avro type. Use this when you need full control over the type mapping — for example, to map a custom Go struct to/from an Avro fixed or record, to handle complex Avro types (records, arrays, maps) as backing types, or to dispatch on schema properties rather than logical type names. For simpler cases where the backing type is a primitive, prefer NewCustomType which infers the wiring from type parameters.

Pass to Parse or SchemaFor as a SchemaOpt.

Matching at parse time: LogicalType and AvroType are checked against schema nodes. All non-empty criteria must match.

  • LogicalType only: matches any schema node with that logicalType
  • LogicalType + AvroType: matches that logicalType on that Avro type
  • AvroType only: matches all nodes of that Avro type
  • Neither: matches ALL schema nodes (use with ErrSkipCustomType for property-based dispatch like Kafka Connect types)

At encode time, GoType is also checked: the Encode function only fires when the value's type matches GoType. This prevents the codec from intercepting native values (e.g. a raw int64 passes through without conversion for a custom-typed long field).

A matching custom type replaces the built-in logical type handler entirely. Encode and Decode callbacks receive raw Avro-native values (int32 for int, int64 for long, []byte for bytes/fixed, etc.), not the enriched types that built-in handlers produce (time.Time, time.Duration, etc.). Among user registrations, first match wins.

For custom types backed by complex Avro types (records, arrays, maps), use the struct form directly — the Encode function can return map[string]any, []any, etc. NewCustomType is limited to primitive backing types.

Example (Override)
package main

import (
	"fmt"

	"github.com/twmb/avro"
)

func main() {
	// Use CustomType directly to override a built-in logical type handler.
	// Here we suppress the timestamp-millis → time.Time conversion and
	// keep the raw int64 epoch millis.
	schema := avro.MustParse(`{
		"type": "record", "name": "Event",
		"fields": [
			{"name": "ts", "type": {"type": "long", "logicalType": "timestamp-millis"}}
		]
	}`, avro.CustomType{
		LogicalType: "timestamp-millis",
		Decode: func(v any, _ *avro.SchemaNode) (any, error) {
			return v, nil // pass through raw int64
		},
	})

	data, _ := schema.Encode(map[string]any{"ts": int64(1767225600000)})
	var out any
	schema.Decode(data, &out)
	m := out.(map[string]any)
	fmt.Printf("ts type: %T\n", m["ts"])
}
Output:
ts type: int64
Example (PropertyDispatch)
package main

import (
	"fmt"

	"github.com/twmb/avro"
)

func main() {
	// CustomType with no LogicalType/AvroType/GoType matches ALL schema
	// nodes. Use ErrSkipCustomType to selectively handle nodes based on
	// schema properties, e.g. Kafka Connect type annotations.
	ct := avro.CustomType{
		Decode: func(v any, node *avro.SchemaNode) (any, error) {
			if node.Props["connect.type"] == "double-it" {
				return v.(int64) * 2, nil
			}
			return nil, avro.ErrSkipCustomType
		},
	}

	// Properties on the type object are available via node.Props in the
	// custom type callback.
	schema := avro.MustParse(`{
		"type": "record", "name": "R",
		"fields": [
			{"name": "x", "type": {"type": "long", "connect.type": "double-it"}},
			{"name": "y", "type": "long"}
		]
	}`, ct)

	data, _ := schema.Encode(map[string]any{"x": int64(5), "y": int64(5)})
	var out any
	schema.Decode(data, &out)
	m := out.(map[string]any)
	fmt.Printf("x=%d y=%d\n", m["x"], m["y"])
}
Output:
x=10 y=5
Example (SchemaFor)
package main

import (
	"fmt"
	"log"
	"reflect"

	"github.com/twmb/avro"
)

func main() {
	// Setting GoType lets SchemaFor infer the Avro schema for struct
	// fields of that type. Without GoType, SchemaFor doesn't know that
	// a Cents field should map to {"type":"long","logicalType":"money"}.
	type Cents int64
	ct := avro.CustomType{
		LogicalType: "money",
		AvroType:    "long",
		GoType:      reflect.TypeFor[Cents](),
		Encode: func(v any, _ *avro.SchemaNode) (any, error) {
			return int64(v.(Cents)), nil
		},
		Decode: func(v any, _ *avro.SchemaNode) (any, error) {
			return Cents(v.(int64)), nil
		},
	}

	type Order struct {
		Price Cents `avro:"price"`
	}
	schema, err := avro.SchemaFor[Order](ct)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(schema.Root().Fields[0].Type.LogicalType)
}
Output:
money

func NewCustomType added in v1.3.0

func NewCustomType[G, A any](
	logicalType string,
	encode func(G, *SchemaNode) (A, error),
	decode func(A, *SchemaNode) (G, error),
) CustomType

NewCustomType returns a type-safe CustomType for the common case of mapping a custom Go type to/from a primitive Avro type. For example, use this to decode Avro longs into a domain-specific ID type, or to encode a Money type as Avro bytes with a "decimal" logical type.

G is the custom Go type (e.g. Money). A is the Avro-native Go type: int32 for int, int64 for long, float32 for float, float64 for double, string for string, []byte for bytes, bool for boolean.

GoType and AvroType are inferred from the type parameters. If A is not a supported Avro-native type, Parse or SchemaFor returns an error.

Note: AvroType is inferred from A's Go kind, which may not match the Avro schema's type for logical types backed by smaller types. For example, time-millis uses Avro "int" but time.Duration is int64 (which infers "long"). Use int32 as A, or use the CustomType struct directly with an explicit AvroType.

For fixed, records, or types needing extra schema metadata, use the CustomType struct directly.

Example
package main

import (
	"fmt"
	"log"

	"github.com/twmb/avro"
)

type ExMoney struct {
	Cents int64
}

func main() {
	// NewCustomType is the easiest way to map a custom Go type to/from a
	// primitive Avro type. The type parameters wire everything up:
	//   G = your Go type, A = the Avro-native Go type it maps to.
	//
	// A is the raw type on the wire:
	//   int32 → Avro int       float32 → Avro float     bool   → Avro boolean
	//   int64 → Avro long      float64 → Avro double    string → Avro string
	//   []byte → Avro bytes
	//
	// The first argument is the logicalType to match. Pass "" to match
	// all schema nodes of the inferred Avro type.
	moneyType := avro.NewCustomType[ExMoney, int64]("money",
		func(m ExMoney, _ *avro.SchemaNode) (int64, error) { return m.Cents, nil },
		func(c int64, _ *avro.SchemaNode) (ExMoney, error) { return ExMoney{Cents: c}, nil },
	)

	schema := avro.MustParse(`{
		"type": "record", "name": "Order",
		"fields": [
			{"name": "price", "type": {"type": "long", "logicalType": "money"}}
		]
	}`, moneyType)

	type Order struct {
		Price ExMoney `avro:"price"`
	}

	data, err := schema.Encode(&Order{Price: ExMoney{Cents: 1999}})
	if err != nil {
		log.Fatal(err)
	}

	var out Order
	if _, err := schema.Decode(data, &out); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%d cents\n", out.Price.Cents)
}
Output:
1999 cents

type Duration

type Duration struct {
	Months       uint32
	Days         uint32
	Milliseconds uint32
}

Duration represents the Avro duration logical type: a 12-byte fixed value containing three little-endian unsigned 32-bit integers representing months, days, and milliseconds.

func DurationFromBytes added in v1.3.0

func DurationFromBytes(b []byte) Duration

DurationFromBytes decodes a 12-byte little-endian fixed value into a Duration. Returns zero Duration if b is shorter than 12 bytes.

func (Duration) Bytes added in v1.3.0

func (d Duration) Bytes() [12]byte

Bytes encodes the Duration as a 12-byte little-endian fixed value, matching the Avro duration wire format.

func (Duration) String added in v1.3.0

func (d Duration) String() string

String returns an ISO 8601 duration string. Zero components are omitted for readability. Examples: "P1Y3M15DT1H30M0.500S", "P30D", "PT1H".
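The 12-byte wire layout follows directly from the three struct fields. A minimal sketch with the standard library (the durationBytes helper is illustrative, not part of this package):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// durationBytes sketches the Avro duration wire format: three
// little-endian uint32s (months, days, milliseconds) packed into 12 bytes.
func durationBytes(months, days, millis uint32) [12]byte {
	var b [12]byte
	binary.LittleEndian.PutUint32(b[0:4], months)
	binary.LittleEndian.PutUint32(b[4:8], days)
	binary.LittleEndian.PutUint32(b[8:12], millis)
	return b
}

func main() {
	b := durationBytes(1, 30, 500)
	fmt.Printf("%x\n", b) // 0100000 01e000000 f4010000 (500 = 0x01f4, little-endian)
}
```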

type Opt added in v1.3.0

type Opt interface {
	// contains filtered or unexported methods
}

Opt configures encoding and decoding behavior. See each option's documentation for which functions it affects. Inapplicable options are silently ignored.

func LinkedinFloats added in v1.3.0

func LinkedinFloats() Opt

LinkedinFloats encodes NaN as JSON null and ±Infinity as ±1e999 in Schema.EncodeJSON, matching the linkedin/goavro convention. Without this option, NaN is encoded as the JSON string "NaN" and ±Infinity as "Infinity"/"-Infinity", following the Java Avro convention. Schema.DecodeJSON always accepts both conventions regardless of this option.

func TagLogicalTypes added in v1.3.0

func TagLogicalTypes() Opt

TagLogicalTypes qualifies union branch names with their logical type (e.g. "long.timestamp-millis" instead of "long"). This applies to Schema.EncodeJSON with TaggedUnions and to Schema.Decode with TaggedUnions. Without this option, branch names use the base Avro type per the specification. This option has no effect without TaggedUnions.

func TaggedUnions added in v1.3.0

func TaggedUnions() Opt

TaggedUnions wraps non-null union values as {"type_name": value}.

In Schema.EncodeJSON, this produces tagged JSON union output. In Schema.Decode and Schema.DecodeJSON to *any, this wraps union values as map[string]any{branchName: value}.

Without this option, union values are bare in all cases. Schema.DecodeJSON always accepts both tagged and bare input regardless of this option.

type Schema

type Schema struct {
	// contains filtered or unexported fields
}

Schema is a compiled Avro schema. Create one with Parse or MustParse, then use Schema.Encode / Schema.Decode to convert between Go values and Avro binary. A Schema is safe for concurrent use.

func MustParse

func MustParse(schema string, opts ...SchemaOpt) *Schema

MustParse is like Parse but panics on error.

func MustSchemaFor added in v1.1.0

func MustSchemaFor[T any](opts ...SchemaOpt) *Schema

MustSchemaFor is like SchemaFor but panics on error.

func Parse

func Parse(schema string, opts ...SchemaOpt) (*Schema, error)

Parse parses an Avro JSON schema string and returns a compiled *Schema. The input can be a primitive name (e.g. `"string"`), a JSON object (record, enum, array, map, fixed), or a JSON array (union). Named types may self-reference. The schema is fully validated: unknown types, duplicate names, invalid defaults, etc. all return errors.

To parse schemas that reference named types from other schemas, use SchemaCache.

func Resolve

func Resolve(writer, reader *Schema) (*Schema, error)

Resolve returns a schema that decodes data written with the writer schema and produces values matching the reader schema's layout. The writer schema is what the data was encoded with (typically from an OCF file header or a schema registry); the reader schema is what your application expects now.

Decoding with the returned schema handles field addition (defaults), field removal (skip), renaming (aliases), reordering, and type promotion. Encoding with it uses the reader's format.

If the schemas have identical canonical forms, reader is returned as-is. Otherwise CheckCompatibility is called first and any incompatibility is returned as a *CompatibilityError. See the package-level documentation for a full example.

Note: the argument order is (writer, reader), matching source-then-destination convention and Java's GenericDatumReader. This differs from the Avro spec text and hamba/avro, which put reader first.

Example
package main

import (
	"fmt"
	"log"

	"github.com/twmb/avro"
)

func main() {
	// v1 wrote User with just a name.
	writerSchema := avro.MustParse(`{
		"type": "record", "name": "User",
		"fields": [{"name": "name", "type": "string"}]
	}`)

	// v2 added an email field with a default.
	readerSchema := avro.MustParse(`{
		"type": "record", "name": "User",
		"fields": [
			{"name": "name",  "type": "string"},
			{"name": "email", "type": "string", "default": ""}
		]
	}`)

	resolved, err := avro.Resolve(writerSchema, readerSchema)
	if err != nil {
		log.Fatal(err)
	}

	// Encode a v1 record (name only).
	v1Data, err := writerSchema.Encode(map[string]any{"name": "Alice"})
	if err != nil {
		log.Fatal(err)
	}

	// Decode old data into the new layout; email gets the default.
	type User struct {
		Name  string `avro:"name"`
		Email string `avro:"email"`
	}
	var u User
	if _, err := resolved.Decode(v1Data, &u); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("name=%s email=%q\n", u.Name, u.Email)
}
Output:
name=Alice email=""

func SchemaFor added in v1.1.0

func SchemaFor[T any](opts ...SchemaOpt) (*Schema, error)

SchemaFor infers an Avro schema from the Go type T. T must be a struct.

Field names are taken from the avro struct tag, falling back to the Go field name. The following tag options are supported:

  • avro:"-" excludes the field
  • avro:",inline" flattens a nested struct's fields into the parent
  • avro:",omitzero" is recorded but does not affect the schema
  • avro:",alias=old_name" adds a field alias (repeatable)
  • avro:",default=value" sets the field's default value (must be last option; scalars only)
  • avro:",timestamp-millis" overrides the logical type (also: timestamp-micros, timestamp-nanos, date, time-millis, time-micros)
  • avro:",decimal(precision,scale)" sets the decimal logical type
  • avro:",uuid" sets the uuid logical type

Type inference:

  • bool → boolean
  • int8, int16, int32 → int
  • int, int64, uint32 → long
  • uint8, uint16 → int
  • float32 → float
  • float64 → double
  • string → string
  • []byte → bytes
  • [N]byte → fixed (size N)
  • *T → ["null", T] union
  • []T → array
  • map[string]T → map
  • struct → record (recursive)
  • time.Time → long with timestamp-millis (override with tag)
  • time.Duration → int with time-millis (override with tag)
  • *big.Rat → requires explicit decimal(p,s) tag
  • [16]byte with uuid tag → string with uuid logical type
Example
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/twmb/avro"
)

func main() {
	type Event struct {
		ID     int64     `avro:"id"`
		Name   string    `avro:"name,default=unnamed"`
		Source string    `avro:"source,default=web"`
		Time   time.Time `avro:"ts"`
		Meta   *string   `avro:"meta"` // *T becomes ["null", T] union
	}

	schema := avro.MustSchemaFor[Event](avro.WithNamespace("com.example"))

	// Encode, then decode back.
	meta := "test"
	data, err := schema.Encode(&Event{
		ID:     1,
		Name:   "click",
		Source: "mobile",
		Time:   time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC),
		Meta:   &meta,
	})
	if err != nil {
		log.Fatal(err)
	}

	var out Event
	if _, err := schema.Decode(data, &out); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("id=%d name=%s source=%s meta=%s\n", out.ID, out.Name, out.Source, *out.Meta)

	// Inspect the inferred schema.
	root := schema.Root()
	for _, f := range root.Fields {
		if f.HasDefault {
			fmt.Printf("field %s: default=%v\n", f.Name, f.Default)
		}
	}
}
Output:
id=1 name=click source=mobile meta=test
field name: default=unnamed
field source: default=web

func (*Schema) AppendEncode

func (s *Schema) AppendEncode(dst []byte, v any, opts ...Opt) ([]byte, error)

AppendEncode appends the Avro binary encoding of v to dst. See Schema.Decode for the Go-to-Avro type mapping. In addition to the types listed there, encoding also accepts:

  • int, long: float64, json.Number
  • float, double: json.Number
  • string: encoding.TextAppender, encoding.TextMarshaler

Example
package main

import (
	"fmt"
	"log"

	"github.com/twmb/avro"
)

func main() {
	schema := avro.MustParse(`"string"`)

	// AppendEncode reuses a buffer across calls, avoiding allocation.
	var buf []byte
	var err error
	for _, s := range []string{"hello", "world"} {
		buf, err = schema.AppendEncode(buf[:0], s)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("encoded %q: %d bytes\n", s, len(buf))
	}
}
Output:
encoded "hello": 6 bytes
encoded "world": 6 bytes

func (*Schema) AppendEncodeJSON added in v1.3.0

func (s *Schema) AppendEncodeJSON(dst []byte, v any, opts ...Opt) ([]byte, error)

AppendEncodeJSON is like Schema.EncodeJSON but appends to dst.

func (*Schema) AppendSingleObject

func (s *Schema) AppendSingleObject(dst []byte, v any, opts ...Opt) ([]byte, error)

AppendSingleObject appends a Single Object Encoding of v to dst: 2-byte magic, 8-byte CRC-64-AVRO fingerprint, then the Avro binary payload.

Example
package main

import (
	"fmt"
	"log"

	"github.com/twmb/avro"
)

func main() {
	schema := avro.MustParse(`{
		"type": "record",
		"name": "Event",
		"fields": [
			{"name": "id",   "type": "long"},
			{"name": "name", "type": "string"}
		]
	}`)

	type Event struct {
		ID   int64  `avro:"id"`
		Name string `avro:"name"`
	}

	// Encode: 2-byte magic + 8-byte fingerprint + Avro payload.
	data, err := schema.AppendSingleObject(nil, &Event{ID: 1, Name: "click"})
	if err != nil {
		log.Fatal(err)
	}

	// Decode.
	var e Event
	if _, err := schema.DecodeSingleObject(data, &e); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("id=%d name=%s\n", e.ID, e.Name)
}
Output:
id=1 name=click

func (*Schema) Canonical

func (s *Schema) Canonical() []byte

Canonical returns the Parsing Canonical Form of the schema, stripping doc, aliases, defaults, and other non-essential attributes. The result is deterministic and suitable for comparison and fingerprinting.

func (*Schema) Decode

func (s *Schema) Decode(src []byte, v any, opts ...Opt) ([]byte, error)

Decode reads Avro binary from src into v and returns the remaining bytes. v must be a non-nil pointer to a type compatible with the schema:

  • null: any (always decodes to nil)
  • boolean: bool, any
  • int, long: int, int8–int64, uint8–uint64, any
  • float: float32, float64, any
  • double: float64, float32, any
  • string: string, []byte, any; also encoding.TextUnmarshaler
  • bytes: []byte, string, any
  • enum: string, int/uint (ordinal), any
  • fixed: [N]byte, []byte, any
  • array: slice, any
  • map: map[string]T, any
  • union: any, *T (for ["null", T] unions), or the matched branch type
  • record: struct (matched by field name or `avro` tag), map[string]any, any

When decoding into *any, primitive types become nil, bool, int32, int64, float32, float64, string, []byte, []any, or map[string]any (for records). Logical types decode to their natural Go equivalents (for example, timestamps to time.Time, decimal to *big.Rat, and duration to Duration).

To produce JSON from decoded *any data, use Schema.EncodeJSON rather than encoding/json.Marshal. EncodeJSON is schema-aware and converts these types back to their Avro representations (e.g. time.Time to epoch integers, []byte to \uXXXX strings).

func (*Schema) DecodeJSON added in v1.2.0

func (s *Schema) DecodeJSON(src []byte, v any, opts ...Opt) error

DecodeJSON decodes Avro JSON from src into v. It unwraps union wrappers, converts bytes/fixed strings, and coerces numeric types to match the schema. When v is *any, the result is returned directly. For typed targets (structs, etc.), the value is round-tripped through binary encode/decode.

DecodeJSON also accepts the non-standard union branch naming used by linkedin/goavro (e.g. "long.timestamp-millis" instead of "long").

DecodeJSON accepts all input formats (tagged and bare unions, Java and goavro NaN/Infinity conventions). Pass TaggedUnions to wrap decoded union values when the target is *any.

Example
package main

import (
	"fmt"
	"log"

	"github.com/twmb/avro"
)

func main() {
	schema := avro.MustParse(`{
		"type": "record",
		"name": "User",
		"fields": [
			{"name": "name",  "type": "string"},
			{"name": "email", "type": ["null", "string"]}
		]
	}`)

	type User struct {
		Name  string  `avro:"name"`
		Email *string `avro:"email"`
	}

	// DecodeJSON accepts both bare and tagged union formats.
	var u1, u2 User
	if err := schema.DecodeJSON([]byte(`{"name":"Alice","email":"[email protected]"}`), &u1); err != nil {
		log.Fatal(err)
	}
	if err := schema.DecodeJSON([]byte(`{"name":"Bob","email":{"string":"[email protected]"}}`), &u2); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s: %s\n", u1.Name, *u1.Email)
	fmt.Printf("%s: %s\n", u2.Name, *u2.Email)
}
Output:
Alice: [email protected]
Bob: [email protected]

func (*Schema) DecodeSingleObject

func (s *Schema) DecodeSingleObject(data []byte, v any, opts ...Opt) ([]byte, error)

DecodeSingleObject decodes a Single Object Encoding message into v after verifying the magic and fingerprint match this schema.
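The header check can be sketched with the standard library. Per the Avro specification, the two magic bytes are 0xC3 0x01 and the fingerprint is little-endian; the splitSingleObject helper below is illustrative, not part of this package:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// splitSingleObject separates a Single Object Encoding message into its
// fingerprint and payload after validating the 2-byte magic (0xC3 0x01).
func splitSingleObject(msg []byte) (fingerprint uint64, payload []byte, err error) {
	if len(msg) < 10 {
		return 0, nil, fmt.Errorf("message too short: %d bytes", len(msg))
	}
	if msg[0] != 0xC3 || msg[1] != 0x01 {
		return 0, nil, fmt.Errorf("bad magic % x", msg[:2])
	}
	return binary.LittleEndian.Uint64(msg[2:10]), msg[10:], nil
}

func main() {
	msg := append([]byte{0xC3, 0x01, 1, 2, 3, 4, 5, 6, 7, 8}, 0x0a, 'h', 'e', 'l', 'l', 'o')
	fp, payload, err := splitSingleObject(msg)
	fmt.Printf("%#x %d %v\n", fp, len(payload), err)
}
```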

func (*Schema) Encode

func (s *Schema) Encode(v any, opts ...Opt) ([]byte, error)

Encode encodes v as Avro binary. It is shorthand for AppendEncode(nil, v).

Example (TextMarshaler)
package main

import (
	"fmt"
	"log"
	"net"

	"github.com/twmb/avro"
)

func main() {
	// Types implementing encoding.TextMarshaler are encoded as Avro
	// strings, and encoding.TextUnmarshaler types decode from them.
	schema := avro.MustParse(`{
		"type": "record",
		"name": "Server",
		"fields": [
			{"name": "name", "type": "string"},
			{"name": "ip",   "type": "string"}
		]
	}`)

	type Server struct {
		Name string `avro:"name"`
		IP   net.IP `avro:"ip"`
	}

	data, err := schema.Encode(&Server{
		Name: "web-1",
		IP:   net.IPv4(192, 168, 1, 1),
	})
	if err != nil {
		log.Fatal(err)
	}

	var out Server
	if _, err := schema.Decode(data, &out); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s: %s\n", out.Name, out.IP)
}
Output:
web-1: 192.168.1.1

func (*Schema) EncodeJSON added in v1.2.0

func (s *Schema) EncodeJSON(v any, opts ...Opt) ([]byte, error)

EncodeJSON encodes v as JSON using the schema for type-aware encoding. By default, union values are written as bare JSON values and bytes/fixed fields use \uXXXX escapes for non-ASCII bytes. Options can modify the output format; see Opt for details.

NaN and Infinity float values are encoded as JSON strings "NaN", "Infinity", and "-Infinity" by default (Java Avro convention), or as null/±1e999 with LinkedinFloats. Standard encoding/json.Marshal cannot represent these values; use EncodeJSON instead.

EncodeJSON accepts the same Go types as Schema.Encode. Map key order in the output is non-deterministic, as with encoding/json.Marshal.

Example
package main

import (
	"fmt"
	"log"

	"github.com/twmb/avro"
)

func main() {
	schema := avro.MustParse(`{
		"type": "record",
		"name": "User",
		"fields": [
			{"name": "name",  "type": "string"},
			{"name": "email", "type": ["null", "string"]}
		]
	}`)

	type User struct {
		Name  string  `avro:"name"`
		Email *string `avro:"email"`
	}
	email := "[email protected]"
	u := User{Name: "Alice", Email: &email}

	// Default: bare union values.
	bare, err := schema.EncodeJSON(&u)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(bare))

	// TaggedUnions: wrapped as {"type": value}.
	tagged, err := schema.EncodeJSON(&u, avro.TaggedUnions())
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(tagged))
}
Output:
{"name":"Alice","email":"[email protected]"}
{"name":"Alice","email":{"string":"[email protected]"}}

func (*Schema) Fingerprint

func (s *Schema) Fingerprint(h hash.Hash) []byte

Fingerprint hashes the schema's canonical form with h. Use NewRabin for CRC-64-AVRO or crypto/sha256 for cross-language compatibility.

The result is big-endian per hash.Hash.Sum. Single Object Encoding uses little-endian fingerprints; use Schema.DecodeSingleObject or SingleObjectFingerprint for that format.
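The CRC-64-AVRO (Rabin) fingerprint is defined in the Avro specification; a minimal Go transcription of the spec's pseudocode, shown here for illustration only (use NewRabin in practice):

```go
package main

import "fmt"

// empty is the CRC-64-AVRO polynomial / empty fingerprint from the Avro spec.
const empty = 0xc15d213aa4d7a795

var table [256]uint64

func init() {
	// Precompute the byte-at-a-time lookup table.
	for i := range table {
		fp := uint64(i)
		for j := 0; j < 8; j++ {
			fp = (fp >> 1) ^ (empty & -(fp & 1))
		}
		table[i] = fp
	}
}

// rabin fingerprints a schema's Parsing Canonical Form.
func rabin(buf []byte) uint64 {
	fp := uint64(empty)
	for _, b := range buf {
		fp = (fp >> 8) ^ table[byte(fp)^b]
	}
	return fp
}

func main() {
	fmt.Printf("%016x\n", rabin([]byte(`"string"`)))
}
```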

func (*Schema) Root added in v1.2.0

func (s *Schema) Root() SchemaNode

Root returns the SchemaNode representation of a parsed schema by re-parsing the original schema JSON. This preserves all metadata including doc strings, namespaces, and custom properties.

Root re-parses the JSON on each call. Cache the result if you need to access it repeatedly (e.g. in a per-message processing loop).

func (*Schema) String

func (s *Schema) String() string

String returns the original JSON passed to Parse, preserving all attributes (doc, aliases, defaults, etc.) unlike Schema.Canonical.

type SchemaCache

type SchemaCache struct {
	// contains filtered or unexported fields
}

SchemaCache accumulates named types across multiple SchemaCache.Parse calls, allowing schemas to reference types defined in previously parsed schemas. This is useful for Schema Registry integrations where schemas have references to other schemas.

Schemas must be parsed in dependency order: referenced types must be parsed before the schemas that reference them.

Parsing the same schema string multiple times is allowed and returns the previously parsed result. This handles diamond dependencies in schema reference graphs (e.g. A→B→D, A→C→D) without requiring callers to track which schemas have already been parsed. Deduplication normalizes the JSON (whitespace and key order) but not the Avro canonical form: schemas that differ only in formatting are deduplicated, but differences in non-canonical fields like doc or aliases are not and will return a duplicate type error.

The returned *Schema from each Parse call is fully resolved and independent of the cache — it can be used for Schema.Encode and Schema.Decode without the cache.

The zero value is ready to use. A SchemaCache is safe for concurrent use.

Example
package main

import (
	"fmt"
	"log"

	"github.com/twmb/avro"
)

func main() {
	cache := new(avro.SchemaCache)

	// Parse the Address type first.
	if _, err := cache.Parse(`{
		"type": "record",
		"name": "Address",
		"fields": [
			{"name": "street", "type": "string"},
			{"name": "city",   "type": "string"}
		]
	}`); err != nil {
		log.Fatal(err)
	}

	// User references Address by name.
	schema, err := cache.Parse(`{
		"type": "record",
		"name": "User",
		"fields": [
			{"name": "name",    "type": "string"},
			{"name": "address", "type": "Address"}
		]
	}`)
	if err != nil {
		log.Fatal(err)
	}

	type Address struct {
		Street string `avro:"street"`
		City   string `avro:"city"`
	}
	type User struct {
		Name    string  `avro:"name"`
		Address Address `avro:"address"`
	}

	data, err := schema.Encode(&User{
		Name:    "Alice",
		Address: Address{Street: "123 Main St", City: "Springfield"},
	})
	if err != nil {
		log.Fatal(err)
	}

	var u User
	if _, err := schema.Decode(data, &u); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s lives at %s, %s\n", u.Name, u.Address.Street, u.Address.City)
}
Output:
Alice lives at 123 Main St, Springfield

func (*SchemaCache) Parse

func (c *SchemaCache) Parse(schema string, opts ...SchemaOpt) (*Schema, error)

Parse parses a schema string, registering any named types (records, enums, fixed) in the cache. Named types from previous Parse calls are available for reference resolution. On failure, the cache is not modified.

type SchemaField added in v1.2.0

type SchemaField struct {
	Name       string         // field name
	Type       SchemaNode     // field schema
	Default    any            // default value (only meaningful when HasDefault is true)
	HasDefault bool           // true if a default value is defined in the schema
	Aliases    []string       // field aliases for schema evolution
	Order      string         // sort order: "ascending" (default), "descending", or "ignore"
	Doc        string         // documentation string
	Props      map[string]any // custom properties (any JSON value)
}

SchemaField represents a field in an Avro record schema.

type SchemaNode added in v1.2.0

type SchemaNode struct {
	Type        string // Avro type or named type reference
	LogicalType string // e.g. date, timestamp-millis, decimal, uuid; empty if none

	Name      string   // name for record, enum, fixed
	Namespace string   // namespace for named types
	Aliases   []string // alternate names for named types (record, enum, fixed)
	Doc       string   // documentation string

	Fields   []SchemaField // record fields
	Items    *SchemaNode   // array element schema
	Values   *SchemaNode   // map value schema
	Branches []SchemaNode  // union member schemas
	Symbols  []string      // enum symbols
	Size     int           // fixed byte size

	EnumDefault    string // default symbol for enum schema evolution
	HasEnumDefault bool   // true if an enum default is defined

	Precision int            // decimal precision
	Scale     int            // decimal scale
	Props     map[string]any // custom properties (any JSON value)
}

SchemaNode is a read-write representation of an Avro schema. It can be obtained from a parsed schema via Schema.Root, or constructed directly and converted to a *Schema via the SchemaNode.Schema method.

The Type field determines which other fields are relevant:

  • Primitives (null, boolean, int, long, float, double, string, bytes): LogicalType, Precision, Scale, and Props are optional. Other fields are ignored.
  • record/error: Name and Fields are required. Namespace, Doc, and Props are optional.
  • enum: Name and Symbols are required. Namespace, Doc, and Props are optional.
  • array: Items is required.
  • map: Values is required.
  • fixed: Name and Size are required. LogicalType, Precision, Scale, Namespace, and Props are optional.
  • union: Branches lists the member schemas.

A named type (record, enum, fixed) that has already been defined elsewhere in the schema can be referenced by setting Type to its full name (e.g. com.example.Address) with no other fields.

func (*SchemaNode) Schema added in v1.2.0

func (n *SchemaNode) Schema() (*Schema, error)

Schema parses the SchemaNode into a *Schema that can be used for encoding and decoding. Returns an error if the node is invalid.

type SchemaOpt added in v1.1.0

type SchemaOpt interface {
	// contains filtered or unexported methods
}

SchemaOpt configures schema construction via Parse, SchemaCache.Parse, or SchemaFor. Inapplicable options are silently ignored.

func WithCustomType added in v1.3.0

func WithCustomType(ct CustomType) SchemaOpt

WithCustomType registers a custom type conversion for use with Parse, SchemaCache.Parse, or SchemaFor. CustomType and NewCustomType both satisfy SchemaOpt directly, so this wrapper is optional — it exists for discoverability.

func WithLaxNames

func WithLaxNames(fn func(string) error) SchemaOpt

WithLaxNames relaxes name validation in Parse and SchemaCache.Parse, overriding the default requirement that names match the Avro strict name regex [A-Za-z_][A-Za-z0-9_]*. If fn is nil, only non-empty names are required. If fn is non-nil, it is called for each name component and should return an error for invalid names. Dot-separated fullnames are split before calling fn. Ignored by SchemaFor.
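A validator passed to WithLaxNames has the shape func(string) error and sees one dot-separated name component at a time. A hypothetical laxName that additionally permits hyphens (common in schemas produced by other ecosystems) might look like:

```go
package main

import (
	"fmt"
	"unicode"
)

// laxName is a hypothetical WithLaxNames validator: it accepts letters,
// digits, underscores, and hyphens, and rejects anything else.
func laxName(name string) error {
	for _, r := range name {
		if !unicode.IsLetter(r) && !unicode.IsDigit(r) && r != '_' && r != '-' {
			return fmt.Errorf("invalid character %q in name %q", r, name)
		}
	}
	return nil
}

func main() {
	fmt.Println(laxName("my-field")) // <nil>
	fmt.Println(laxName("bad name")) // rejected: contains a space
}
```

With the package, this would be wired in as avro.Parse(schemaJSON, avro.WithLaxNames(laxName)).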

func WithName added in v1.1.0

func WithName(name string) SchemaOpt

WithName overrides the Avro record name in SchemaFor. By default the Go struct name is used. Ignored by Parse.

func WithNamespace added in v1.1.0

func WithNamespace(ns string) SchemaOpt

WithNamespace sets the Avro namespace for the top-level record in SchemaFor. Ignored by Parse.

type SemanticError

type SemanticError struct {
	// GoType is the Go type involved, if applicable.
	GoType reflect.Type
	// AvroType is the Avro schema type (e.g. "int", "record", "boolean").
	AvroType string
	// Field is the dotted path to the record field (e.g. "address.zip"),
	// if the error occurred within a record.
	Field string
	// Err is the underlying error.
	Err error
}

SemanticError indicates a Go type is incompatible with an Avro schema type during encoding or decoding.

func (*SemanticError) Error

func (e *SemanticError) Error() string

func (*SemanticError) Unwrap

func (e *SemanticError) Unwrap() error

type ShortBufferError

type ShortBufferError struct {
	// Type is what was being read (e.g. "boolean", "string", "uint32").
	Type string
	// Need is the number of bytes required (0 if unknown).
	Need int
	// Have is the number of bytes available.
	Have int
}

ShortBufferError indicates the input buffer is too short for the value being decoded.

func (*ShortBufferError) Error

func (e *ShortBufferError) Error() string

Directories

Path Synopsis
Package ocf implements Avro [Object Container Files] (OCF).
