| 17949bd1a6 | feat: add training Dockerfile | 2024-05-29 04:15:35 +08:00 |
| 0a287e3b46 | feat: save() | 2024-05-22 19:58:13 +08:00 |
| 20fd2fbe08 | fix: move trainer | 2024-05-20 15:29:02 +00:00 |
| 86e0c50a65 | fix: global rank | 2024-05-16 23:31:24 +08:00 |
| d4b9aaa1d6 | feat: change rank when multi machine training | 2024-05-16 23:17:35 +08:00 |
| 874c160eae | fix: delete matplotlib dependence | 2024-05-16 23:09:35 +08:00 |
| 233bec6d1c | feat: torchrun on single machine | 2024-05-16 20:55:18 +08:00 |
| 24240f1c3a | feat: trainer class for single/multi GPU | 2024-05-16 20:25:56 +08:00 |
| 8f3253ff24 | fix: single machine parallel training success | 2024-05-16 16:28:22 +08:00 |
| aedc6b46e9 | feat: single machine DDP (test) | 2024-05-16 15:59:20 +08:00 |
| 939aa6d92e | docs: add some hint comment | 2024-05-15 20:55:08 +08:00 |
| fc01163995 | feat: train on single GPU | 2024-05-12 01:48:39 +08:00 |
| bf905e9e03 | feat: dataset | 2024-05-12 00:43:50 +08:00 |