Commit Graph

21 Commits

Author SHA1 Message Date
62e86d5e8b feat: time evaluation in trainer 2024-05-29 04:47:50 +08:00
e5c97ca8a1 docs: update README about docker relay network 2024-05-29 04:31:49 +08:00
bdaa1d4846 fix: training log 2024-05-29 04:25:15 +08:00
17949bd1a6 feat: add training Dockerfile 2024-05-29 04:15:35 +08:00
cb58840988 docs: change script for developing 2024-05-27 18:57:40 +08:00
cf00e840a0 docs: add README 2024-05-27 18:53:09 +08:00
0a287e3b46 feat: save() 2024-05-22 19:58:13 +08:00
20fd2fbe08 fix: move trainer 2024-05-20 15:29:02 +00:00
86e0c50a65 fix: global rank 2024-05-16 23:31:24 +08:00
d4b9aaa1d6 feat: change rank when multi machine training 2024-05-16 23:17:35 +08:00
874c160eae fix: delete matplotlib dependence 2024-05-16 23:09:35 +08:00
233bec6d1c feat: torchrun on single machine 2024-05-16 20:55:18 +08:00
24240f1c3a feat: trainer class for single/multi GPU 2024-05-16 20:25:56 +08:00
8f3253ff24 fix: single machine parallel training success 2024-05-16 16:28:22 +08:00
aedc6b46e9 feat: single machine DDP (test) 2024-05-16 15:59:20 +08:00
939aa6d92e docs: add some hint comment 2024-05-15 20:55:08 +08:00
e7572347c9 docs: update .gitignore 2024-05-12 23:33:33 +08:00
953d3ce1ee docs: add .gitignore 2024-05-12 23:32:40 +08:00
fc01163995 feat: train on single GPU 2024-05-12 01:48:39 +08:00
bf905e9e03 feat: dataset 2024-05-12 00:43:50 +08:00
de44cad219 feat: add dataset container's example dockerfile 2024-05-11 21:55:58 +08:00