Pull requests: kubeflow/trainer
feat: add Megatron-Core GPT Tensor Parallelism example notebook
size/L
#3201
opened Feb 13, 2026 by
XploY04
1 task
feat(runtimes): Add XGBoost runtime(KEP-2598)
kind/feature
size/L
#3200
opened Feb 12, 2026 by
Krishna-kg732
4 tasks
[WIP] feat(api): Replace PodTemplateOverrides with TemplateOverrides
do-not-merge/work-in-progress
size/L
#3199
opened Feb 10, 2026 by
andreyvelich
chore(deps): bump transformers from 4.57.6 to 5.1.0 in /cmd/runtimes/deepspeed
dependencies
Pull requests that update a dependency file
python
Pull requests that update Python code
size/XS
#3197
opened Feb 10, 2026 by
dependabot
bot
chore(deps): update huggingface-hub requirement from <1.4,>=0.27.0 to >=0.27.0,<1.5 in /cmd/initializers/dataset
dependencies
python
size/XS
#3194
opened Feb 10, 2026 by
dependabot
bot
feat(runtimes): Add KAI Scheduler plugin for gang-scheduling support
ok-to-test
size/XL
#3186
opened Feb 8, 2026 by
Raakshass
8 tasks done
chore: unit tests for runtime registry and RuntimeInfo error handling
size/M
#3168
opened Feb 3, 2026 by
Goku2099
feat(release): Automated trainer release process
do-not-merge/work-in-progress
size/XL
#3148
opened Jan 28, 2026 by
milinddethe15
Draft
1 task
feat: add validation for reserved MPI environment variables
size/M
#3145
opened Jan 28, 2026 by
VishalPainjane
4 of 5 tasks
chore(deps): bump torch from 2.9.1 to 2.10.0 in /cmd/runtimes/deepspeed
dependencies
python
size/XS
#3133
opened Jan 27, 2026 by
dependabot
bot
chore(deps): bump pytorch/pytorch from 2.9.1-cuda12.8-cudnn9-runtime to 2.10.0-cuda12.8-cudnn9-runtime in /cmd/trainers/torchtune
dependencies
docker
Pull requests that update docker code
size/XS
#3130
opened Jan 27, 2026 by
dependabot
bot
fix: allow atomic update of podTemplateOverrides when unsuspending TrainJob
kind/bug
kind/feature
ok-to-test
size/M
#3122
opened Jan 24, 2026 by
NarayanaSabari
feat: Add CI pipeline to validate manifests and helm chart
ok-to-test
size/M
#3101
opened Jan 18, 2026 by
juyterman1000
feat(runtimes): Support Elastic PyTorch in TrainJob
do-not-merge/hold
size/L
#3099
opened Jan 17, 2026 by
SoumyaRaikwar
feat(runtimes): add AMD ROCm torch distributed runtime ref #2335
size/S
#3097
opened Jan 16, 2026 by
JEETDESAI25
feat(api): add immutability for TrainingRuntimes types
size/XXL
#3082
opened Jan 11, 2026 by
Misha6Sharma
1 task
chore(examples): add GPU passthrough support to container backend example
size/L
#3075
opened Jan 6, 2026 by
muzzlol
feat(docs): proposal for adding TTLSecondsAfterFinished and ActiveDeadlineSeconds fields to TrainJob CRD
size/L
#3068
opened Jan 5, 2026 by
XploY04
feat: add TTLSecondsAfterFinished and ActiveDeadlineSeconds fields to TrainJob CRD
size/XXL
#3065
opened Jan 4, 2026 by
XploY04
feat: support for Flux Framework as HPC manager
size/XXL
#3064
opened Jan 4, 2026 by
vsoch
1 task done