Request: Pre-Training Dataset PL:NL Ratio

I would like to know the ratio of programming language (PL) to natural language (NL) data in the pre-training datasets of the codet5-base and codet5p-220m-bimodal models. For example, is the ratio PL:NL = 2:1?