Those files are converted from MIDI files written by humans. They are almost 100% correct, apart from rare rhythm errors caused by rubato. On the other hand, they are limited in number, comparatively unorganized, and have confusing names.
These files are generated by optical music recognition (OMR) software from printed music scores. The resulting MusicXML files contain mistakes, so it is probably better to use them in noise-robust ways, such as statistics and machine learning.
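As an illustration of a noise-robust use, aggregate statistics such as a pitch-class histogram change very little when a few notes are misrecognized. The following is only a minimal sketch, assuming uncompressed MusicXML text (not compressed `.mxl` archives) and using only the Python standard library. The function name `pitch_class_histogram` and the `SAMPLE` score are hypothetical and are not part of this dataset.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Semitone offsets of the natural note names within one octave.
STEP_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def pitch_class_histogram(musicxml_text):
    """Return the relative frequency of each of the 12 pitch classes."""
    root = ET.fromstring(musicxml_text)
    counts = Counter()
    for pitch in root.iter("pitch"):  # rests have no <pitch> element
        step = pitch.findtext("step")
        if step not in STEP_TO_SEMITONE:
            continue  # skip malformed pitches instead of failing
        alter_text = pitch.findtext("alter")
        try:
            alter = int(round(float(alter_text))) if alter_text else 0
        except ValueError:
            alter = 0  # treat an unreadable accidental as natural
        counts[(STEP_TO_SEMITONE[step] + alter) % 12] += 1
    total = sum(counts.values())
    return [counts[pc] / total if total else 0.0 for pc in range(12)]


# Hypothetical one-measure score: C4, E4, G4, C5.
SAMPLE = """
<score-partwise>
  <part id="P1">
    <measure number="1">
      <note><pitch><step>C</step><octave>4</octave></pitch></note>
      <note><pitch><step>E</step><octave>4</octave></pitch></note>
      <note><pitch><step>G</step><octave>4</octave></pitch></note>
      <note><pitch><step>C</step><octave>5</octave></pitch></note>
    </measure>
  </part>
</score-partwise>
"""

hist = pitch_class_histogram(SAMPLE)
```

Because the histogram is normalized over all notes in a piece, a single wrong pitch among hundreds shifts each bin only slightly. Rhythm-based features would be a riskier choice for these files, since most recognition errors concern rhythm.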
For the PDFs tested, the average recognition rate is 83.80% for PhotoScore and 89.25% for SharpEye. Most errors concern rhythm. Accuracy could be improved with approaches such as Multiple OMR by Victor Padilla, Alex McLean, Alan Marsden and Kia Ng; the average accuracy figures above are taken from their paper.