Commit 312daaa
Accelerate some adjustment for mixed precision (#1009)
* Use accelerator.autocast when computing loss
According to the accelerate docs, loss computation should be performed
within the accelerator.autocast context manager:
https://huggingface.co/docs/accelerate/v0.21.0/en/quicktour#mixed-precision-training
I tested if this makes a difference by running the following notebook
with fp16 precision:
https://nbviewer.org/github/skorch-dev/skorch/blob/master/notebooks/Hugging_Face_Finetuning.ipynb
I found no difference at all: The runtime was practically the same and
the losses were identical. Still, I think it's better to have this than
not, as it is recommended by the accelerate docs.
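As a rough illustration of the change described above, the sketch below wraps the loss computation in `accelerator.autocast()` when an accelerator is present and falls back to a no-op context otherwise. It is not the code from this commit: `FakeAccelerator`, `compute_loss`, and `mean_squared_error` are hypothetical stand-ins so the example runs without accelerate or PyTorch installed. Only the `autocast()` context manager name comes from the accelerate docs linked above.

```python
from contextlib import contextmanager, nullcontext


class FakeAccelerator:
    """Hypothetical stand-in for accelerate.Accelerator (illustration only)."""

    def __init__(self):
        self.autocast_entered = 0

    @contextmanager
    def autocast(self):
        # The real context manager enables mixed precision for the enclosed
        # code; this stand-in only counts how often it was entered.
        self.autocast_entered += 1
        yield


def compute_loss(criterion, y_pred, y_true, accelerator=None):
    """Compute the loss, inside accelerator.autocast() if an accelerator is given."""
    if accelerator is not None:
        context = accelerator.autocast()
    else:
        context = nullcontext()
    with context:
        return criterion(y_pred, y_true)


def mean_squared_error(y_pred, y_true):
    """Toy criterion (assumption): plain-Python mean squared error."""
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)


accelerator = FakeAccelerator()
loss_plain = compute_loss(mean_squared_error, [1.0, 2.0], [1.0, 4.0])
loss_autocast = compute_loss(
    mean_squared_error, [1.0, 2.0], [1.0, 4.0], accelerator
)
```

With a real `Accelerator`, the same `with accelerator.autocast():` pattern would surround the criterion call. The fallback to `nullcontext()` keeps the code path identical for nets that have no accelerator.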
* Update LR scheduler callback to work w/ accelerate
According to the accelerate docs:
https://huggingface.co/docs/accelerate/quicktour#mixed-precision-training
the LR scheduler step should sometimes be skipped when using mixed
precision training because accelerate may skip update steps internally.
Therefore, I updated the LR scheduler callback to check if the net has
an accelerator and if it does, to check if a step is necessary.
This is actually quite hard to test because the necessity of stepping
depends on accelerate's internal logic, which we don't want to test, and
which might change in the future. Therefore, the added test just runs
training with accelerate, mixed precision, and some lr schedulers,
verifying that there is no error.
When running these tests + the normal lr scheduler tests locally on a
machine that supports fp16, I get 100% line coverage of lr_scheduler.py.
I think this is good enough.
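The check described above could look roughly like the following sketch. It is not the callback code from this commit: `maybe_step` and `CountingScheduler` are hypothetical names, and the net and accelerator are simple stand-in objects so the example runs without accelerate or PyTorch. The one real name assumed here is the `optimizer_step_was_skipped` attribute, which the accelerate docs linked above use for this purpose.

```python
from types import SimpleNamespace


class CountingScheduler:
    """Hypothetical stand-in for a PyTorch LR scheduler; it only counts steps."""

    def __init__(self):
        self.num_steps = 0

    def step(self):
        self.num_steps += 1


def maybe_step(net, lr_scheduler):
    """Step the scheduler unless the accelerator skipped the optimizer step.

    Returns True if the scheduler was stepped, False otherwise.
    """
    # Nets that do not use accelerate have no `accelerator` attribute.
    accelerator = getattr(net, "accelerator", None)
    if accelerator is not None and accelerator.optimizer_step_was_skipped:
        # With mixed precision, the optimizer update may have been skipped
        # (for example after a gradient overflow), so the schedule must not
        # advance either.
        return False
    lr_scheduler.step()
    return True


scheduler = CountingScheduler()

# Stand-in nets (assumptions): one without accelerate, two with an accelerator.
plain_net = SimpleNamespace()
skipped_net = SimpleNamespace(
    accelerator=SimpleNamespace(optimizer_step_was_skipped=True)
)
stepped_net = SimpleNamespace(
    accelerator=SimpleNamespace(optimizer_step_was_skipped=False)
)

results = [maybe_step(net, scheduler) for net in (plain_net, skipped_net, stepped_net)]
```

Looking up the accelerator with `getattr` and a default of `None` means nets without accelerate keep their previous behavior, and only the skipped-step case returns early.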
* Non-functional clean ups related to lr schedulers
While working on the fixes in this PR, I also cleaned up some lr
scheduler code. These clean ups are non-functional.
1. We imported CyclicLR as TorchCyclicLR. I'm not sure why but it is
somehow related to very old PyTorch versions we no longer support, so I
removed this.
2. Fixed some indentations for conditional checks to improve
readability.
* Reviewer comment: Simplify conditional code
1 parent 07fc260 commit 312daaa
4 files changed
Lines changed: 102 additions & 29 deletions