Commentary

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

November 9, 2024 - LLM, Code Generation
2 mins

A completely open-source code-generation LLM family. Their paper reports results on HumanEval, MBPP, BigCodeBench, and LiveCodeBench (mentioned earlier in this stream). Qwen still seems to be the best performer, but having access to the training data and the ability to reproduce the results is a big improvement over other open-source models.

OpenCoder is an open and reproducible code LLM family which includes 1.5B and 8B base and chat models, supporting both English and Chinese languages. Starting from scratch, OpenCoder is trained on 2.5 trillion tokens composed of 90% raw code and 10% code-related web data, reaching the performance of top-tier code LLMs. We provide not only model weights and inference code, but also the reproducible training data, the complete data processing pipeline, rigorous experimental ablation results, and detailed training protocols.

For Java, the 1.5B model is on par with Qwen, but the 8B model is a bit behind.

I tested the 8B model (the Q6_K, Q8_0 and F16 variants). It gave a workable (if not great) answer to the prompt “write a Java function to connect to Aeron and send Hello World over a publication”, but on one occasion it appended a long digression about PRAM and SMC resets on Macs running El Capitan 10.11.6, finishing with a recommendation to upgrade my macOS to the latest version. That degree of hallucination was a first for me.
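For reference, here is a minimal sketch of what a reasonable answer to that prompt could look like, using the standard Aeron client API. The channel URI and stream ID are arbitrary example values, and it assumes an Aeron Media Driver is already running and the aeron-client and agrona jars are on the classpath:

```java
import io.aeron.Aeron;
import io.aeron.Publication;
import org.agrona.concurrent.UnsafeBuffer;

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public final class AeronHello
{
    public static void main(final String[] args)
    {
        // Connect to the running Media Driver with default settings,
        // then create a publication on an example UDP channel/stream.
        try (Aeron aeron = Aeron.connect();
             Publication publication =
                 aeron.addPublication("aeron:udp?endpoint=localhost:20121", 1001))
        {
            final byte[] message = "Hello World".getBytes(StandardCharsets.UTF_8);
            final UnsafeBuffer buffer =
                new UnsafeBuffer(ByteBuffer.allocateDirect(message.length));
            buffer.putBytes(0, message);

            // offer() returns a negative code (NOT_CONNECTED, BACK_PRESSURED, ...)
            // until a subscriber is connected and the message is accepted.
            while (publication.offer(buffer, 0, message.length) < 0)
            {
                Thread.yield();
            }
        }
    }
}
```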