Skip to content

PHP RFC: Bound-Erased Generic Types#21969

Open
azjezz wants to merge 1 commit into
php:masterfrom
carthage-software:rfc/bound-erased-generic-types
Open

PHP RFC: Bound-Erased Generic Types#21969
azjezz wants to merge 1 commit into
php:masterfrom
carthage-software:rfc/bound-erased-generic-types

Conversation

@azjezz
Copy link
Copy Markdown

@azjezz azjezz commented May 6, 2026

RFC: https://wiki.php.net/rfc/bound_erased_generic_types

Note: This PR has been squashed, full history can be found in here

Copy link
Copy Markdown
Member

@DanielEScherzer DanielEScherzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not even going to pretend like I understand this, or even read it all, but some initial notes, I hope you don't mind

I very much hope that this RFC gets to a point where it is voted on and hopefully passes, generics are great!

You may want to consider adding some kind of overview for how the implementation logic works (e.g. a new file in docs/source) to aid reviewers in understanding how all of the parts go together

You may want to also split this up, it currently adds support for

Classes, interfaces, traits, functions, methods, closures, and arrow functions

and an initial version might just support classes, interfaces, and methods. Just a thought, please don't take this as criticism - I really hope that something like this is added to PHP

Comment thread ext/reflection/php_reflection.stub.php Outdated
Comment thread ext/reflection/php_reflection.stub.php Outdated
Comment thread Zend/zend_ast.c Outdated
Comment thread Zend/zend_compile.c Outdated
Comment thread Zend/zend_compile.c Outdated
Comment thread Zend/zend_compile.c Outdated
Comment thread Zend/zend_compile.h Outdated
Comment thread Zend/zend_compile.h Outdated
Comment thread Zend/zend_extensions.h Outdated
Comment thread Zend/zend_modules.h Outdated
@azjezz
Copy link
Copy Markdown
Author

azjezz commented May 6, 2026

Hi @DanielEScherzer

Thanks for the review!

On the internal docs: i'd lean toward holding off for now. If the RFC doesn't pass, we'd just throw the prose away; if it does, the API/internals will likely shift between now and acceptance, and I'd rather not rewrite the doc multiple times along the way. Once the RFC lands (hoping it does), I'm happy to write a proper INTERNALS section in one go against the final shape. or do you think there's value in having something now? (for review?)

@DanielEScherzer
Copy link
Copy Markdown
Member

Hi @DanielEScherzer

Thanks for the review!

On the internal docs: i'd lean toward holding off for now. If the RFC doesn't pass, we'd just throw the prose away; if it does, the API/internals will likely shift between now and acceptance, and I'd rather not rewrite the doc multiple times along the way. Once the RFC lands (hoping it does), I'm happy to write a proper INTERNALS section in one go against the final shape. or do you think there's value in having something now? (for review?)

I was thinking that this would be for reviewers to understand the rest of the patch - you know it best, since you wrote it, but others could use a roadmap/overview. I don't even known where to start for giving an example of "something like ..." for this, so I'll give one for #16952 (which I know best because I wrote it)

For supporting attributes on constants

  • the grammar for the language is adjust to allow attributes on constants, moving constant declarations into attributed_statement which can have (but doesn't need) attributes
  • the AST is stored in the same list of children as the constants in the declaration, as the last child
  • zend_compile_const_decl() will compile the attributes into a normal hashtable with zend_compile_attributes, and then emit a ZEND_DECLARE_ATTRIBUTED_CONST with a data znode where the data is a pointer to that table
  • at runtime, the VM handler will call zend_constant_add_attributes() to add the attributes

For supporting the deprecated attribute specifically

  • zend_constant_add_attributes() will flag the constant with CONST_DEPRECATED if the deprecated attribute is present
  • the existing logic to emit deprecation warnings for deprecated internal constants is instead replaced with calls to the new zend_deprecated_constant() function which handles reading for attributes if present

Just to have an understanding of how all of the pieces fit together

@azjezz
Copy link
Copy Markdown
Author

azjezz commented May 6, 2026

Makes sense. Will add it tomorrow morning 👍🏼

@azjezz azjezz force-pushed the rfc/bound-erased-generic-types branch 4 times, most recently from ed79328 to 4bfba49 Compare May 10, 2026 18:08
Comment thread Zend/zend_vm_def.h
@@ -8928,6 +8928,41 @@ ZEND_VM_HOT_HANDLER(211, ZEND_TYPE_ASSERT, CONST, ANY, NUM)
ZEND_VM_NEXT_OPCODE_CHECK_EXCEPTION();
}

ZEND_VM_HANDLER(212, ZEND_VERIFY_GENERIC_ARGUMENTS, TMP|UNUSED, UNUSED)
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried multiple approachs here to do the arity/bound check, and i think this is the best option.

the other options i tried, but did not work:

  1. pack the arity byte into a spare slot in ZEND_INIT_FCALL / ZEND_NEW: this was the original plan, and it was the main reason why the 255 cap limit on the generic paraemters was introduced ( although, kept because its good to have anyway ), however, the full 32 bit operand on those opcodes is consumed by the value-arg count, narrowing it to 24 bits would have given us a byte for the generic arity, but that would be a BC break i assume ( even if no one is realistically calling a function with over 16 million params ).
  2. side table per call frame: keyed off a flag on zend_function, measurable regression on the bench suite ( Zend/bench.php - ~1.8% ) from a cold-cacheline read on every call
  3. A fn_flags bit: same idea as 2, smaller but still measurable regression ( ~1.2 ) because the bit is read on every call regardless of turbofish use.

so a dedicated opcode emitted only at turbofish sites is the "simplest" approach here, and does not effect performance.

Comment thread Zend/zend_inheritance.c
@@ -1245,6 +1618,60 @@ static inheritance_status do_inheritance_check_on_method(
}
/* }}} */

static zend_function *zend_maybe_substitute_inherited_method(
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the substitution sites for inherited members (this one, do_inherit_property, the property hook signatures it walks, zend_do_traits_property_binding, zend_add_trait_method) all follow the same pattern: clone the parent's zend_property_info or zend_function, allocate a fresh arg_info block in the child's arena, share the body opcodes via the existing refcount mechanism. Body opcodes are not re-emitted per child.

the cost is a documented runtime laxity: the body's VERIFY_* opcodes were laid down at the parent's compile time against the parent's unsubstituted view, so the child sees the substituted signature on the clone but the original opcodes inside.

The alternative would be bridge methods that re-verify the substituted return type before calling the shared body, which costs a permanent per-substitution call-frame on every inherited method invocation.

future work (body-level monomorphization) can close this gap.

see:

Comment thread Zend/zend_inheritance.c
zend_variance_walk_function(NULL, op_array->generic_parameters, op_array);
}

static void zend_check_generic_link_bounds(
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zend_check_generic_link_bounds validates each supplied arg against the corresponding target parameter's bound. It has a special case for forwarded class-scope T refs: when the supplied arg is a leaf class-scope T ref of the inheriting class, the effective arg type is the inheriting class's own bound on that parameter, the check becomes "ce's bound on Y must satisfy target's bound on T".

The reading is strict: if ce has no bound on Y (the parameter is unbounded, hence mixed), the check fails for any target bound stricter than mixed. An unbounded child parameter cannot be forwarded into a bounded ancestor slot.

The relaxed alternative would be to let an unbounded child parameter inherit the parent's bound implicitly when forwarded. Implicit inheritance of bounds across class boundaries is harder to model both for the compiler and for human readers, and yields a worse error message when the user forgets to write a bound they meant to write.

e.g:

class A<T: string|int|float> {}

class B<Y> extends A<Y> {}

strict: fail "mixed vs string|int|float"
relaxed: allow, Y is now implicitly bound to string|int|float

Comment thread Zend/Optimizer/optimize_func_calls.c Outdated
Comment on lines +140 to +143
if (zend_call_has_generic_arguments_check(opline - 1)) {
/* The verify opcode must run; inlining would orphan it. */
return;
}
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zend_try_inline_call inlines simple function calls by replacing their INIT_FCALL and DO_FCALL opcodes with NOP.

e.g. before the optimizer:

INIT_FCALL                ; allocates the call frame
VERIFY_GENERIC_ARGUMENTS  ; reads EX(call)->func
DO_FCALL

after inlining without the guard:

NOP                       ; frame never set up
VERIFY_GENERIC_ARGUMENTS  ; EX(call)->func walks an unrelated frame causing segfault
NOP

The constraint is permanent: any future pass that elides an INIT_FCALL/DO_FCALL pair has to consider whether intervening opcodes depend on the call frame.

@azjezz azjezz marked this pull request as ready for review May 10, 2026 19:06
@azjezz azjezz requested review from dstogov and kocsismate as code owners May 10, 2026 19:06
@azjezz azjezz force-pushed the rfc/bound-erased-generic-types branch 10 times, most recently from 689c568 to c26a1a5 Compare May 11, 2026 09:30
@prateekbhujel
Copy link
Copy Markdown
Contributor

I checked the failing CircleCI ARM job.

Current failure is only:

Zend/tests/generics/inheritance/diamond/trait_diamond_variadic.phpt

It prints the reflected type correctly:

param: string|int
is union: yes

and then segfaults when the merged trait method is called: (new C)->sink(1, "x", 2, "y").

I built the branch locally on macOS arm64 with debug + ZTS and the same test passes for me, both plain and with opcache enabled. So this looks more like Linux ARM / JIT / merged arg_info lifetime than a test expectation issue.

The useful split is probably to rerun that one test on the ARM worker with JIT disabled. If it only crashes with JIT, the bad path is likely the merged trait method clone being executable but not looking quite like a normal trait clone to the runtime/JIT.

@azjezz
Copy link
Copy Markdown
Author

azjezz commented May 11, 2026

@prateekbhujel reproduced it locally in aarch64 Ubuntu 24.04 Docker container. pushing a fix now :)

@azjezz azjezz force-pushed the rfc/bound-erased-generic-types branch from c26a1a5 to c18f989 Compare May 11, 2026 21:43
@morrisonlevi
Copy link
Copy Markdown
Contributor

morrisonlevi commented May 11, 2026

It's a big RFC so sorry if this is working-as-intended, but I seem to have found a bug.

Reproducer code

src/sequence.php:

<?php declare(strict_types=1);

interface Sequence<+T>
{
}

src/arraysequence.php:

<?php declare(strict_types=1);

final class ArraySequence<+T>
    implements Sequence<T>
{
}

fails.php:

<?php declare(strict_types=1);

set_include_path(__DIR__ . '/src' . PATH_SEPARATOR . get_include_path());
spl_autoload_register();

// class_exists(Sequence::class);
class_exists(ArraySequence::class);

echo "unreachable\n";

Run it:

php -d display_errors=1 fails.php

Results in:

Fatal error: Type parameter T declared covariant (+T) cannot appear in invariant position in <redacted>/src/arraysequence.php on line 3
Stack trace:
#0 [internal function]: spl_autoload('ArraySequence')
#1 <redacted>/fails.php(6): class_exists('ArraySequence')
#2 {main}

If you un-comment the line:

// class_exists(Sequence::class);

Then it works.

It's a bug because T in class ArraySequence<+T> implements Sequence<T> is covariant and so is the T in interface Sequence<+T>, so they should be compatible.

@azjezz
Copy link
Copy Markdown
Author

azjezz commented May 11, 2026

@morrisonlevi looking into it!

@azjezz azjezz force-pushed the rfc/bound-erased-generic-types branch from c18f989 to 45189ee Compare May 11, 2026 23:53
@azjezz
Copy link
Copy Markdown
Author

azjezz commented May 11, 2026

@morrisonlevi fixed!

Copy link
Copy Markdown
Contributor

@morrisonlevi morrisonlevi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found another issue. if we use namespace/ syntax to fully qualify something, it loses generics in reflection.

reflection.php
<?php declare(strict_types=1);

namespace X;

interface Iface<+A>
{
}

trait Mixin<+A>
{
}

class FqnImplements
    implements \X\Iface<int>
{
}

class NamespaceImplements
    implements namespace\Iface<int>
{
}

class FqnUsesTrait
{
    use \X\Mixin<int>;
}

class NamespaceUsesTrait
{
    use namespace\Mixin<int>;
}

/**
 * @param list<\ReflectionNamedType> $args
 */
function render_args(array $args): string
{
    if ($args === []) {
        return '[]';
    }

    return '[' . \implode(', ', \array_map(
        static fn(\ReflectionNamedType $arg): string => $arg->getName(),
        $args,
    )) . ']';
}

function show(string $label, array $args): void
{
    \printf("%s: count=%d args=%s\n", $label, \count($args), render_args($args));
}

$fqn_implements_args = (new \ReflectionClass(FqnImplements::class))
    ->getGenericArgumentsForParentInterface(Iface::class);
$namespace_implements_args = (new \ReflectionClass(NamespaceImplements::class))
    ->getGenericArgumentsForParentInterface(Iface::class);

$fqn_trait_args = (new \ReflectionClass(FqnUsesTrait::class))
    ->getGenericArgumentsForUsedTrait(Mixin::class);
$namespace_trait_args = (new \ReflectionClass(NamespaceUsesTrait::class))
    ->getGenericArgumentsForUsedTrait(Mixin::class);

show('implements FQN', $fqn_implements_args);
show('implements namespace-relative', $namespace_implements_args);
show('trait use FQN', $fqn_trait_args);
show('trait use namespace-relative', $namespace_trait_args);

if (\count($namespace_implements_args) !== 1 || \count($namespace_trait_args) !== 1) {
    \fwrite(\STDERR, "BUG: namespace-relative inheritance clauses compile but lose generic arguments in reflection.\n");
    exit(1);
}

@azjezz azjezz force-pushed the rfc/bound-erased-generic-types branch from 45189ee to 6faad45 Compare May 12, 2026 01:18
@azjezz
Copy link
Copy Markdown
Author

azjezz commented May 12, 2026

@morrisonlevi fixed! :D I have added both to test suite, please report any other edge cases you might encounter.

@azjezz azjezz force-pushed the rfc/bound-erased-generic-types branch 2 times, most recently from a38781a to 8d12f23 Compare May 12, 2026 01:48
@prateekbhujel
Copy link
Copy Markdown
Contributor

Small follow-up on the optimizer note above: I pushed the patch to my fork so it is not just a pasted diff.

Commit: prateekbhujel@ccb50d3

It is based on the current RFC head (a38781a) and only skips the backward scan when the current op_array has no turbofish args table. If you prefer a PR against the RFC branch, I can open that instead.

@azjezz
Copy link
Copy Markdown
Author

azjezz commented May 12, 2026

@prateekbhujel feel free to open a PR against carthage-software:rfc/bound-erased-generic-types-history branch ( the working branch ), this branch ( PR ) is squashed from that one.

@prateekbhujel
Copy link
Copy Markdown
Contributor

Opened it against the working branch here: carthage-software#2

@azjezz azjezz force-pushed the rfc/bound-erased-generic-types branch from 8d12f23 to 6c895fd Compare May 12, 2026 02:07
@prateekbhujel
Copy link
Copy Markdown
Contributor

Opened one more small PR against the working branch for a Reflection edge case I hit while testing transitive interfaces:

carthage-software#3

It fixes ReflectionClass::getGenericArgumentsForParentInterface() returning an empty list when the generic ancestor is reached through a non-generic intermediate interface/class.

@azjezz azjezz force-pushed the rfc/bound-erased-generic-types branch from 6c895fd to d4ba7a6 Compare May 12, 2026 02:16
Signed-off-by: azjezz <azjezz@protonmail.com>
@azjezz azjezz force-pushed the rfc/bound-erased-generic-types branch from d4ba7a6 to ed70992 Compare May 12, 2026 02:26
@prateekbhujel
Copy link
Copy Markdown
Contributor

Opened a small PR for the current CI failure here: carthage-software#4

The failing test was creating an inherited-method arg_info clone even though the method-level Box<O> return did not actually substitute any class-scope slot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants