04/06/2026
A hand X-ray looks simple. Teaching AI to read one isn't.
During ML6's Christmas Projects Week, a team of four set out to build an end-to-end pipeline for scoring erosive hand osteoarthritis from X-rays: detecting each finger joint, scoring it against the clinical GUSS scale, and producing heatmaps showing where the model focused its attention.
One week later, the best model reached a fivefold improvement over the baseline on the metric that carries clinical meaning. That is not a deployable system, but it is a credible signal that the pipeline is learning something real about the disease.
The blog post is less about the final numbers and more about the engineering decisions behind them: the shortcuts the team deliberately avoided, the baselines that kept the evaluation honest, and the evaluation protocol anchored in how radiologists actually work.
Because in medical AI, "it works" is only the start of the conversation.
Read more: https://hubs.la/Q04k1W900