I was working on MLDataLabeler batches when I tried to accept a HIT and it said it requires qualifications. I looked at my quals and found this: Has anyone gotten this? Has anyone tried to message them/do they respond? Their directions don't always explain everything, and some of the examples they provide don't cover every image. And sometimes the directions conflict with the examples.
They're known for taking quals and blocking, even blocking good workers, which is a head-scratcher for me. Evidently they don't like us doing too many of their HITs; I don't know what they think we're here for. I want all the money. This one cut me - check out the worker reviews: https://turkerview.com/requesters/A29SD5N4EEZ25M-mldatalabeler

I did over 800 of their "Shoe aesthetics" HITs over a couple of days. They changed my qual value to 3, so not only could I no longer do the HITs, I couldn't even see them. I received this email:

Greetings from Amazon Mechanical Turk,

One of your Qualifications has been revoked by the Requester who maintains the Qualification. Please see details about the Qualification below.

Qualification Revoked: Shoe aesthetics
Requester: MLDataLabeler
Grant Date: November 27, 2020
Revoke Date: December 1, 2020
Revoke Reason: You are now eligible to accept additional "Assessment of Shoe Aesthetics" tasks. This qualification is only being used to control the number of tasks an individual Worker can complete. The qualification will be reapplied when your task limit is met. It is not a measure of work quality and does not affect your eligibility for any other MLDataLabeler tasks.
I got it last year and tried to contact them, but I never got a response. Lots of people were also banned without reason around that time, so I think it was a poorly programmed batch. If they make mistakes writing the instructions, why wouldn't they also make mistakes writing their algorithms? It's very likely that some batches are programmed poorly, so the algorithm flags correct answers as wrong and then bans lots of people for no reason.
I disagree with your conclusion. The purpose of creating a dataset is so that the system can learn. There would be no reason to keep training on basketball detection if the system could already tell whether a submitted bounding box correctly captures a basketball (for example). Additionally, their goal is not to build a system that judges the accuracy of submissions. It's far more likely they use a more concrete reference: something as simple as asking the same question (or showing the same image) every 75 submissions and then measuring the variance across a worker's answers to that repeated item.
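For what it's worth, a check like that is trivial to implement, which is part of why I think it's more plausible than some AI judging every submission. Here's a minimal sketch of the idea; everything here (the interval, the thresholds, the function names) is my own guess at how a requester might do it, not anything from MLDataLabeler:

```python
# Hypothetical sketch of a repeated-gold-item quality check.
# GOLD_INTERVAL, VARIANCE_LIMIT, and the flagging rule are illustrative
# assumptions, not MLDataLabeler's actual logic.

from statistics import pvariance

GOLD_INTERVAL = 75       # re-show a known ("gold") image every 75 submissions
VARIANCE_LIMIT = 4.0     # assumed tolerance on a 1-10 rating scale


def is_gold_slot(submission_index: int) -> bool:
    """True when this submission slot should be the repeated gold item."""
    return submission_index > 0 and submission_index % GOLD_INTERVAL == 0


def should_flag(gold_answers: list[float], reference: float) -> bool:
    """Flag a worker whose ratings of the same gold image are too erratic
    (high variance) or drift too far from the reference answer (high bias)."""
    if len(gold_answers) < 3:      # need a few samples before judging anyone
        return False
    spread = pvariance(gold_answers)
    bias = abs(sum(gold_answers) / len(gold_answers) - reference)
    return spread > VARIANCE_LIMIT or bias > 2.0


print(should_flag([7, 6.5, 7.5, 7], reference=7.0))  # consistent worker
print(should_flag([2, 9, 5, 10], reference=7.0))     # erratic worker
```

A rule this simple would also explain the false positives people report: an honest worker who legitimately changes their opinion about a borderline image gets lumped in with the random clickers.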