- Scale AI, the data annotation giant, has posted nearly 60 listings across dozens of languages, suggesting a concerted push for more language data.
- Western languages like German command as much as 15 times more pay than languages from the Global South.
- Large language models like ChatGPT still struggle to operate in languages without sufficient training data available online.
Silicon Valley’s biggest artificial intelligence developers have a language problem. Generative AI tools, like ChatGPT, thrive in English and Spanish. But early research shows these same tools are chronically underperforming in “low-resource” languages that are less represented on the internet. Now, one of the biggest suppliers of training data seems to be tackling that problem head-on.
Scale AI, one of Silicon Valley’s most prominent training data companies, is currently hiring for nearly 60 contract writer roles across dozens of languages. Each job listing claims the work is for a project to train “generative artificial intelligence models to become better writers.” The languages include Hausa, Punjabi, Thai, Lithuanian, Persian, Xhosa, Catalan, and Zulu, among many others. Six job postings, under the category “experts,” are looking to hire writers specifically for regional South Asian languages, including Kannada, Gujarati, Urdu, and Telugu.
There are significant pay disparities between the languages, with Western languages commanding as much as 15 times more than those from the Global South. For example,...