[Question] What do you think of Google's LAMB optimizer?

霏艺Faye · September 21, 2020 · Librarian

Also, what do you think of the AdamW optimizer?

And what do you think of the GELU activation function?

What about the "new GELU" activation function?

thphd · former 2047 site admin

On LAMB

At first, people assumed all gradients were alike, so plain SGD should work for everything.

Then they realized that gradients differ across parameters, so some sort of adaptation was needed, like RMSprop/AdaGrad/Adam.

After some more time, they realized that the variation in gradients cannot be characterized by a single scalar, or even a bunch of scalars. The degree of adaptation needs to catch up with the degree of variation, so more sophisticated adaptation schemes were developed: normalization, feedback control, and so on.

If we keep going down this path, we're likely to end up with network topologies where feedback/normalization mechanisms are distributed among the massive number of weights, each taking care of the few weights around it, much like a mammalian brain.
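To make that progression concrete, here is a minimal pure-Python sketch of the three stages for a single layer: one global step size (SGD), per-coordinate scaling (Adam-style), and a layer-wise normalization on top of that (LAMB-style trust ratio). The function names, default hyperparameters, and the choice of the identity as the norm-scaling function are assumptions for illustration, not code from any particular library.

```python
import math

def l2_norm(x):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(xi * xi for xi in x))

def sgd_step(w, g, lr=0.1):
    """Stage 1: plain SGD, every coordinate shares one step size."""
    return [wi - lr * gi for wi, gi in zip(w, g)]

def adam_direction(g, m, v, t, beta1=0.9, beta2=0.999, eps=1e-6):
    """Stage 2: Adam-style per-coordinate scaling.

    Returns (direction, new_m, new_v); t is the 1-based step count.
    """
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]
    m_hat = [mi / (1 - beta1 ** t) for mi in m]  # bias correction
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    direction = [mh / (math.sqrt(vh) + eps) for mh, vh in zip(m_hat, v_hat)]
    return direction, m, v

def lamb_step(w, g, m, v, t, lr=0.1, weight_decay=0.01):
    """Stage 3: LAMB-style step for one layer.

    The Adam direction (plus decoupled weight decay) is rescaled by the
    layer-wise trust ratio ||w|| / ||update||. Assumption: the scaling
    function applied to ||w|| is the identity.
    """
    direction, m, v = adam_direction(g, m, v, t)
    update = [d + weight_decay * wi for d, wi in zip(direction, w)]
    w_norm, u_norm = l2_norm(w), l2_norm(update)
    trust = w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0
    new_w = [wi - lr * trust * ui for wi, ui in zip(w, update)]
    return new_w, m, v

# Usage: one layer whose two gradients differ by five orders of magnitude.
w = [3.0, 4.0]
g = [100.0, -0.001]
new_w, m, v = lamb_step(w, g, m=[0.0, 0.0], v=[0.0, 0.0], t=1)
step_len = l2_norm([a - b for a, b in zip(new_w, w)])
```

In this sketch the length of the LAMB step works out to `lr * ||w||` (about 0.5 here) no matter how large or small the raw gradients are, which is the "normalization" idea: the step is sized relative to the layer's own weights rather than to the gradient scale.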

September 21, 2020 /p/37308

霏艺Faye · Librarian

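@thphd #15838589 I'm doing SEO.

This way, when people search for these keywords, Google will show 2047 in the results.

Chinese speakers who search for these keywords tend to be fairly well educated...

Also, I need a place with a fairly high level of education where I can discuss academic topics.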

September 22, 2020 /p/37403
