Description
What problem are you trying to solve?
Support configuring EFA for static NodePools with heterogeneous instance types.
Today (with v1.11.0+), Karpenter supports launching EC2 instances with EFA interfaces for both dynamic and static capacity:
- Pods can request vpc.amazonaws.com/efa resources, and nodes launched for these pods will have EFA interfaces configured on all network cards. This can only be used in dynamic provisioning cases.
- The EC2NodeClass spec.networkInterfaces field can be configured with EFA-only interfaces. This can be used with static provisioning or dynamic provisioning.
Configuring EFA devices through the EC2NodeClass spec.networkInterfaces field does not work well with NodePools that contain heterogeneous instance types, because the network topology (for example, the number of network cards) varies between instance types.
This is something we considered in the design. Workloads that run on EFA-enabled instances, such as AI/ML, often use a single instance type (e.g. p5.48xlarge).
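To illustrate the mismatch, here is a minimal sketch (not Karpenter code) that checks one fixed interface list against two instance types. The `InterfaceSpec` type, the `unsupported_cards` helper, and the `NETWORK_CARDS` table are hypothetical names for illustration, and the network card counts are assumed values that should be confirmed against EC2 instance type data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceSpec:
    """Hypothetical stand-in for one entry of a static interface list."""
    network_card_index: int
    interface_type: str  # assumed values: "interface", "efa", "efa-only"


# Assumed network card counts, for illustration only.
NETWORK_CARDS = {"p5.48xlarge": 32, "p4d.24xlarge": 4}


def unsupported_cards(interfaces, instance_type):
    """Return the card indices in the config that the instance type lacks."""
    available = NETWORK_CARDS[instance_type]
    return sorted(
        {i.network_card_index for i in interfaces if i.network_card_index >= available}
    )


# A static configuration written with a 32-card instance type in mind:
# card 0 carries IP traffic, the remaining cards are EFA-only.
static_config = [InterfaceSpec(0, "efa")] + [
    InterfaceSpec(card, "efa-only") for card in range(1, 32)
]

for instance_type in NETWORK_CARDS:
    missing = unsupported_cards(static_config, instance_type)
    status = "fits" if not missing else f"{len(missing)} interfaces reference missing cards"
    print(f"{instance_type}: {status}")
```

A configuration that is valid for one instance type can reference network cards that do not exist on another, which is why a single spec.networkInterfaces list is hard to share across a heterogeneous NodePool.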
One option for a path forward is exposing an enum field spec.interfacePolicy - https://github.com/aws/karpenter-provider-aws/blob/main/designs/efa-for-static-capacity.md#other-design-considerations---interfacepolicy.
This would require statically generated network interface configurations for each policy, "EFA + IP bandwidth optimized" and "IP optimized" (all EFA-only except the primary network interface), regenerated for each release, similar to pricing, ENI trunking, etc. This presents a challenge, as there are no codified configurations and no API for best practices in EFA configurations.
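As a rough sketch of what such a policy could expand to per instance type, the snippet below derives an interface list from a network card count. It is not taken from the design document: the enum member names, the `generate_interfaces` helper, and the output field names are hypothetical. It also assumes that "EFA + IP bandwidth optimized" means an EFA interface with IP on every card, and that the primary interface under "IP optimized" is an EFA interface with IP.

```python
from enum import Enum


class InterfacePolicy(Enum):
    """Hypothetical values for the proposed spec.interfacePolicy enum."""
    EFA_AND_IP_BANDWIDTH_OPTIMIZED = "efa-and-ip-bandwidth-optimized"
    IP_OPTIMIZED = "ip-optimized"


def generate_interfaces(network_card_count, policy):
    """Expand a policy into one interface entry per network card."""
    if network_card_count < 1:
        raise ValueError("an instance type needs at least one network card")
    interfaces = []
    for card in range(network_card_count):
        # The primary card always carries IP traffic. The remaining cards
        # carry IP only under the bandwidth-optimized policy (assumption).
        if card == 0 or policy is InterfacePolicy.EFA_AND_IP_BANDWIDTH_OPTIMIZED:
            interface_type = "efa"
        else:
            interface_type = "efa-only"
        interfaces.append({"networkCardIndex": card, "interfaceType": interface_type})
    return interfaces


for policy in InterfacePolicy:
    types = [i["interfaceType"] for i in generate_interfaces(4, policy)]
    print(policy.value, types)
```

The card count per instance type would still have to come from somewhere, which is the statically generated data the paragraph above refers to.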
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment