Add w3_validate_<type name> functions #101

Open
MartinSStewart wants to merge 6 commits into lamdera-next from wire-validator-3

MartinSStewart commented Jun 11, 2026

This PR adds support for w3_validate_<type name> : <type name> -> Result String () functions. These are user-defined functions that the generated w3_decode_<type name> functions call to decide whether a decoded value of a custom type should be accepted.

Here's an example of what w3_validate looks like in real code:

module PersonName exposing (PersonName(..), fromString, fromStringLossy, maxLength, toString)

import String.Nonempty exposing (NonemptyString(..))


type PersonName
    = PersonName NonemptyString


maxLength : number
maxLength =
    32


fromString : String -> Result String PersonName
fromString text =
    case String.trim text |> String.Nonempty.fromString of
        Just nonempty ->
            if String.Nonempty.length nonempty > maxLength then
                Err "Too long"

            else if String.Nonempty.any (\char -> char == '\n' || char == '\u{000D}' || char == '@') nonempty then
                Err "Name can't contain line breaks or @ symbol"

            else
                PersonName nonempty |> Ok

        Nothing ->
            Err "Can't be empty"


w3_validate_PersonName : PersonName -> Result String ()
w3_validate_PersonName (PersonName text) =
    fromString (String.Nonempty.toString text) |> Result.map (\_ -> ())


toString : PersonName -> String
toString (PersonName a) =
    String.Nonempty.toString a

Motivation behind this feature

If you have an opaque type that can only be constructed via a function that enforces some guarantee (perhaps a type Name = Name String that can't be longer than 100 characters), then you might expect that guarantee to hold when you serialize data. The reality, however, is that w3_decode functions circumvent any opaque type guarantees and just create an instance of the type directly. This means a hacker could manually create a ToBackend request that contains an opaque type that violates some guarantee that type is supposed to have. Since the backend is probably coded with the assumption that opaque type guarantees hold, this could lead to bugs or security vulnerabilities.
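To make the problem concrete, here is a minimal sketch that does not use Lamdera at all. The module name NameExample and the functions decodeUnchecked and decodeValidated are hypothetical stand-ins, with a plain String playing the role of the wire payload. They are not the generated w3_decode code; they only model "apply the constructor directly" versus "apply the constructor, then re-check the guarantee".

```elm
module NameExample exposing (Name, decodeUnchecked, decodeValidated, fromString, toString)

-- Name is opaque: other modules can only build one through fromString.


type Name
    = Name String


fromString : String -> Maybe Name
fromString text =
    if String.length text <= 100 then
        Just (Name text)

    else
        Nothing


toString : Name -> String
toString (Name text) =
    text


{-| Hypothetical stand-in for a decoder that applies the constructor
directly, so the 100 character limit is never checked.
-}
decodeUnchecked : String -> Maybe Name
decodeUnchecked payload =
    Just (Name payload)


{-| Hypothetical stand-in for a validating decoder: decode first, then
re-run the same check the smart constructor enforces.
-}
decodeValidated : String -> Maybe Name
decodeValidated payload =
    decodeUnchecked payload
        |> Maybe.andThen (\(Name text) -> fromString text)
```

With a 200 character payload, decodeUnchecked produces a Name that fromString would never have returned, while decodeValidated rejects it.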

Drawbacks

This feature has a number of drawbacks.

  • The user can potentially mess up their w3_validate_<type name> implementation such that it returns Err "some error" for values that can be created normally. If those values get sent to the backend, the decoder will fail and the request will be silently dropped (the "some error" will get logged on the Lamdera server, but that's it).
  • w3_validate functions will confuse tools like IDEs and elm-review (until patches are made), since they will look like unused functions.
  • The current implementation of this feature doesn't allow a w3_validate function to be defined and a w3_decode/encode function to be referenced by the user in the same module. This could be solved, but for now it isn't, due to implementation complexity. I think in practice it won't be an issue, since the user rarely references w3_decode/encode functions, and if they do, they can easily do that in a separate module.
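The first drawback can be sketched as follows. The Username module and the driftedValidate function are hypothetical and exist only for illustration. driftedValidate duplicates the length rule and gets the boundary wrong, so it rejects a value the constructor accepts. The w3_validate_Username version reuses the constructor, as the PersonName example does, so the two rules cannot drift apart.

```elm
module Username exposing (Username, driftedValidate, fromString, w3_validate_Username)


type Username
    = Username String


maxLength : Int
maxLength =
    20


fromString : String -> Result String Username
fromString text =
    if String.isEmpty text then
        Err "Can't be empty"

    else if String.length text > maxLength then
        Err "Too long"

    else
        Ok (Username text)


{-| Hypothetical buggy validator (deliberately not named w3_validate_...):
it uses < where the constructor allows <=, so a name of exactly
maxLength characters is constructible but fails validation.
-}
driftedValidate : Username -> Result String ()
driftedValidate (Username text) =
    if String.length text < maxLength then
        Ok ()

    else
        Err "Too long"


{-| Safer shape: delegate to the constructor and discard the result.
-}
w3_validate_Username : Username -> Result String ()
w3_validate_Username (Username text) =
    fromString text |> Result.map (\_ -> ())
```

A 20 character name passes fromString and w3_validate_Username but is rejected by driftedValidate, which is the kind of mismatch that would cause a legitimate request to be dropped.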

w3_unsafe_decode

  • There are now w3_unsafe_decode_<type name> functions that are generated alongside w3_decode. These behave identically to how w3_decode behaved before this PR (or how w3_decode behaves now if the user has not written any w3_validate functions).
  • For Lamdera infrastructure, w3_unsafe_decode is intended to be used everywhere except for ToBackend decoding. The reasoning is that we want to prevent a hacker from sending invalid data to the backend. But if we get invalid data some other way (perhaps the user has changed the validation function between migrations, and old data is decoded with a new, stricter w3_validate function), then it's better to just let it through, since the alternative could be a backup failing to load or the backend failing to communicate with the frontend.
  • Note that we still need to make some changes to Lamdera infrastructure so that it uses w3_unsafe_decode, since the code isn't all located in lamdera/compiler.
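The intended policy can be summarised as a small selection function. This is only a sketch: the DecodePolicy module, the Source type and its constructors, and decoderFor are hypothetical names, and the decoder type is left as a type variable so nothing here depends on the real Lamdera API.

```elm
module DecodePolicy exposing (Source(..), decoderFor)

{-| Hypothetical labels for where the bytes being decoded came from.
-}


type Source
    = ToBackendMsg
    | ToFrontendMsg
    | BackendModelBackup
    | EvergreenMigration


{-| Pick the validating decoder only for attacker-controlled input.
Everything else keeps the pre-validation behaviour.
-}
decoderFor : Source -> { validating : decoder, unsafe : decoder } -> decoder
decoderFor source decoders =
    case source of
        ToBackendMsg ->
            decoders.validating

        ToFrontendMsg ->
            decoders.unsafe

        BackendModelBackup ->
            decoders.unsafe

        EvergreenMigration ->
            decoders.unsafe
```

Listing every source explicitly, rather than using a wildcard branch, means that adding a new source forces a decision about which chain it should use.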

In conclusion, I'm not particularly happy with this feature, but I don't know of any better way to solve the motivating security issue. I've spoken with a couple of people and brainstormed ideas, and this seems like the best option.

claude added 6 commits May 27, 2026 20:37
…ders

When a module defines `w3_validate_<TypeName> : <TypeName> -> Result String ()`
for a custom type, the generated `w3_decode_<TypeName>` now calls it after
decoding: `Ok ()` decodes successfully, while `Err msg` logs the message via
Lamdera.Wire3.debug and fails the decode.

Each w3_validate_ function is verified at compile time: the named type must be a
custom type defined in the same module (not a type alias, and not missing), the
function must have a type annotation, and its signature must be exactly
`<TypeName> <tvars> -> Result String ()` using the type's declared variables.
Any violation is a compile error.
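
As an illustration of the signature rule, here is a sketch of a validator for a type with one type variable. The Holder module and its 10 item limit are hypothetical. The rejected variants are shown as comments so that the module itself compiles; they follow the rules described above rather than quoting actual compiler output.

```elm
module Holder exposing (Holder(..), w3_validate_Holder)


type Holder a
    = Holder (List a)


{-| Accepted shape: the annotation is present and uses the variable `a`
exactly as declared on the type.
-}
w3_validate_Holder : Holder a -> Result String ()
w3_validate_Holder (Holder items) =
    if List.length items <= 10 then
        Ok ()

    else
        Err "Too many items"



-- Shapes that the checks described above are meant to reject:
--
--   w3_validate_Holder : Holder Int -> Result String ()
--       (concrete type in place of the declared variable)
--
--   w3_validate_Holder : Holder a -> Result String Int
--       (Ok payload is not unit)
--
--   w3_validate_Holder (Holder items) = ...
--       (no type annotation at all)
```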

When a module contains validators, the generated wire functions are placed
after user definitions so the generated decoders can reference them.

Tests: a compiling fixture covering the codegen (plain tvar, number-constrained
tvar, and nested usage), plus compile-error fixtures for the undefined-type,
type-alias, missing-annotation, wrong-signature, and concrete-tvar cases.

https://claude.ai/code/session_01URzKFJwLCrv3r2W28bPiW6
Compile-success fixtures verifying the w3_validate hook behaves with recursive
custom types (no codegen loops, compiler crashes, or dependency-graph issues):

- Wire_Validate_Recursive: a directly self-referential type (Tree).
- Wire_Validate_RecursiveRecord: a custom type referencing a record that
  references it back (Node -> NodeData -> List Node), i.e. mutually recursive
  generated decoders plus the validator call.
- Wire_Validate_RecursiveExtra: recursion through Maybe (Chain), parameterised
  recursion through List validated with the type variable (Rose a), and mutual
  recursion between two separately-validated types (Ping/Pong).

https://claude.ai/code/session_01URzKFJwLCrv3r2W28bPiW6
The existing w3_validate tests are all compile-time: they check the generated
decoder type-checks and that bad validator definitions are rejected by the
compiler. None execute a decoder, so neither the Ok-accept nor the Err-reject
path was tested at runtime.

This adds an elm-test (run via the project's --compiler=lamdera harness, e.g.
`cd test/scenario-alltypes && npx elm-test --compiler=lamdera
tests/Wire3ValidateTest.elm`) that encodes Validated values and decodes them
back, asserting:

- values that pass w3_validate_Validated decode to `Just value`
- values that fail it are rejected and decode to `Nothing`

Encoding does not run validation, so a validation-failing value can be encoded
and then shown to fail on decode. The wire functions are imported from
Test.Wire_Validate (cross-module) on purpose.

https://claude.ai/code/session_01URzKFJwLCrv3r2W28bPiW6
Compile-error fixtures filling in gaps in the existing wireValidateErrors
suite:
  - WrongOkType: Result with Int as the Ok payload (not unit).
  - TvarRename: type Holder a validated as Holder b -- strict tvar-name match
    rejects alpha-renamed variables.
  - TvarSwap: type Pair a b validated as Pair b a -- multi-tvar order matters.
  - ArgCount: validator declared as a value (no argument).

Compile-success fixtures (added to wireTestFiles):
  - MultiTvar: type Pair a b with a validator using both type variables.
  - Phantom: type Phantom a (variable declared but unused in constructors).

Runtime tests (extending tests/Wire3ValidateTest.elm) covering behaviour the
compile tests can't observe:
  - Recursive Tree: validation runs at every node, so a deeply-nested invalid
    Leaf causes the whole decode to fail.
  - Container (a record containing Validated values): validation runs through
    aggregate fields, both for the direct field and for list entries.

https://claude.ai/code/session_01URzKFJwLCrv3r2W28bPiW6
…error

When a module defines a w3_validate_* function, the generated wire functions
are appended after the user's code so the generated decoders can call the
validators. That meant any user top-level code in the same module that also
referenced a generated w3_encode_*/w3_decode_* function produced a forward
reference, which crashed the type solver with an internal Map.! error.

Detect that case in addWireGenerations_ (only on the validator path) and emit a
normal compile error naming the offending definition and wire function. Adds a
total VarTopLevel collector (topLevelRefsInExpr) that handles every Expr_
constructor, since getLvars is specialised to generated code and errors on
shapes like multi-branch if.

Tests: two fixtures (encoder and decoder reference) asserting the new error.
…e_<T>

Validation should only run on attacker-controlled, backend-inbound data, not on
trusted data (persistence, evergreen migrations). So every custom type and alias
now gets two decoders:

  * w3_decode_<T>        validating: applies w3_validate_<T> if present and
                         recurses through the w3_decode_* chain (for the Lamdera
                         runtime to use on backend-inbound data).
  * w3_unsafe_decode_<T> non-validating: behaves like w3_decode_<T> did before
                         validation existed, recursing through the
                         w3_unsafe_decode_* chain.

The two chains are threaded through decoder codegen via a DecodeMode parameter.
The validating chain is byte-identical to the previous w3_decode output (only
the prefix and, for unions, the validator hook differ), so existing behaviour is
unchanged. Built-in decoders (decodeList etc.) are mode-agnostic and thread the
mode through their element decoder.

The trusted in-repo consumers are repointed at the unsafe chain to preserve
their pre-validation behaviour: the evergreen migration harness and backend-model
persistence reload.

Also adds w3_unsafe_decode_* stubs + exports and extends the getForeignSig
fallback to the new prefix. The validator-module reference check already covers
w3_unsafe_decode_* (names are derived from the generated defs).

Tests: Wire3ValidateTest now contrasts unsafe-decode-accepts-invalid vs
validating-decode-rejects; the full Test.Wire suite passes with all fixtures
compiling under both chains.