Legal LLM reasoning

As a project, I want to build a legal reasoning model that can produce a decision when given a case. I have half a million court decisions. Each decision first describes the case, then cites the relevant intermediate law articles that support the outcome, and ends with the final decision. I have some questions about the implementation. Should I fine-tune the model on the decisions and legal corpora, or would it be better to use a reinforcement learning algorithm such as GRPO? If I use RL, there are further considerations, such as how to train the reward model.
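To make the RL question more concrete, here is a minimal sketch of one possible reward signal. It assumes each decision can be parsed into the three parts mentioned above and scores a model output against the real decision with a rule, instead of a trained reward model. The `Decision` schema, the function names, and the 0.7 outcome weight are all hypothetical choices for illustration, not taken from any library or from my actual data.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Decision:
    """One court decision split into its three parts (hypothetical schema)."""
    case_text: str
    cited_articles: set[str]
    outcome: str


def article_f1(predicted: set[str], reference: set[str]) -> float:
    """F1 overlap between the articles the model cited and the ones in the real decision."""
    if not predicted or not reference:
        return 0.0
    overlap = len(predicted & reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def rule_based_reward(
    pred_articles: set[str],
    pred_outcome: str,
    reference: Decision,
    outcome_weight: float = 0.7,  # assumed weight, would need tuning
) -> float:
    """Weighted mix of outcome match (0 or 1) and cited-article F1, in [0, 1]."""
    same_outcome = pred_outcome.strip().lower() == reference.outcome.strip().lower()
    outcome_score = 1.0 if same_outcome else 0.0
    article_score = article_f1(pred_articles, reference.cited_articles)
    return outcome_weight * outcome_score + (1.0 - outcome_weight) * article_score


# Example usage with made-up values
reference = Decision(
    case_text="(case description)",
    cited_articles={"Art. 12", "Art. 45"},
    outcome="claim accepted",
)
reward = rule_based_reward({"Art. 12"}, "Claim accepted", reference)
```

A function like this could serve as the per-sample score for GRPO-style training, but exact string matching on outcomes is probably too strict for real decisions, so a learned reward model or a looser matcher may still be needed.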

submitted by /u/redd1t_use
