• Naz@sh.itjust.works
      5 months ago

      Llama-3 (open weights) at 70B is pretty capable if you can manage to run it. I'd say it's somewhere between GPT-3.5 and GPT-4.

      In second place is WizardLM-2 at 8B parameters, if you are memory-constrained.

      You should run the largest model you can fit completely in VRAM for maximum speed. Higher precision is better (FP32 > FP16 > FP8 > FP4 > FP2), but 8-bit is more than enough for most consumer/local LLM deployments, and 4-bit is worth trying if you want to trade a little accuracy for size.
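      To get a feel for what "fits in VRAM" means, here's a rough back-of-the-envelope sketch (my own hypothetical helper, not from any library) that estimates the memory the model weights alone would take at each precision. It ignores the KV cache and activation overhead, which add more on top:

```python
# Rough VRAM estimate for model weights only (ignores KV cache/activations).
def weight_gib(params_billion: float, bits: int) -> float:
    """Memory in GiB for `params_billion` billion parameters at `bits` per weight."""
    total_bytes = params_billion * 1e9 * bits / 8
    return total_bytes / 2**30

for bits in (32, 16, 8, 4, 2):
    print(f"70B @ {bits:>2}-bit: {weight_gib(70, bits):7.1f} GiB")
```

      A 70B model at 4-bit is still around 33 GiB of weights, which is why the "largest model that fits" for most single consumer GPUs (24 GB or less) ends up being a quantized 7B–13B model rather than 70B.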

      LLM Arena is a good place to benchmark the different models on a personal A/B basis. Everyone has different needs for what a model should do well, whether that's help with coding, translation, medical questions, and so on.

      They all have various strengths and weaknesses at the moment, since optimizing a model for a specific domain or task seems to (not guaranteed, but seems to) make it weaker at other tasks.