Comments:
Good test. We hope to improve the model shortly.
Which is better at coding, 1206 or flash thinking? I'm not too sure yet about the superiority of CoT for coding in general.
Next time you test these models, give them an image of an economic calendar from FXStreet and ask them to count the red bars and orange bars in the screenshot, or to read and extract data from a screenshot of a Rogers and Mayhew steam table. I think image analysis is still a critical area these models need to work on.