For robots, drones, or vehicles to achieve higher levels of autonomy, they need artificial intelligence based on reliable data. Companies working on machine learning projects must juggle research, development, analysis and other tasks connected with their core functions. Their in-house employees do not necessarily have the time for annotating data at the volumes required to train machine learning algorithms. Such work can also prove to be costly, since engineers and other team members tend to command a high rate of pay.
Why is data annotation important?
Properly annotated data is essential for developing autonomous vehicles, computer vision for aerial drones, and many other AI and robotics applications.
Self-driving cars must be able to identify everything they might encounter on the road. Therefore, human data annotators need to label pedestrians, traffic signs, other vehicles, and many other items in millions of images for such cars to function safely and properly.
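To make the labeling task above concrete, here is a minimal sketch of what one annotated road image might look like as a data record. The class names, field names, file name, and pixel values are all hypothetical and chosen only for illustration; real annotation tools each define their own formats.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class BoundingBox:
    """One labeled object in an image (hypothetical schema)."""
    label: str   # e.g. "pedestrian", "traffic_sign", "vehicle"
    x: int       # left edge of the box, in pixels
    y: int       # top edge of the box, in pixels
    width: int   # box width, in pixels
    height: int  # box height, in pixels


@dataclass
class ImageAnnotation:
    """All labels a human annotator attached to a single image."""
    image_id: str
    boxes: List[BoundingBox] = field(default_factory=list)

    def count_by_label(self) -> Dict[str, int]:
        """Count how many objects of each class were labeled."""
        counts: Dict[str, int] = {}
        for box in self.boxes:
            counts[box.label] = counts.get(box.label, 0) + 1
        return counts


# Illustrative values only; not taken from any real dataset.
frame = ImageAnnotation(
    image_id="frame_000123.jpg",
    boxes=[
        BoundingBox("pedestrian", 412, 220, 38, 96),
        BoundingBox("pedestrian", 470, 215, 35, 101),
        BoundingBox("traffic_sign", 610, 90, 42, 42),
        BoundingBox("vehicle", 120, 240, 210, 130),
    ],
)
print(frame.count_by_label())
```

Multiply a record like this by millions of images and the scale of the manual work becomes clear.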
In precision agriculture, drones can help farmers identify poorly growing crops so that they can adjust applications of fertilizer, water, or pesticide before an entire harvest is lost. For this to work, computer vision models have to be trained to identify fruits and vegetables, which can vary widely in shape and orientation, under different conditions.
Service robots and AI-powered assistants rely on natural language processing to understand what people are saying. This requires text annotation so that the machine learning algorithms can learn different types of sentence structures. It is the job of human data annotators to break the input text into smaller phrases for the computers to digest.
Since data annotation is very time-consuming, many firms outsource the task to service providers that have the staffing capacity to get everything done on time and within budget. To help you find a provider that fits your needs, here is a list of 10 data annotation companies currently operating in the U.S. market.
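As a rough illustration of the text annotation described above, the sketch below stores each phrase an annotator marked as a character span with a label. The `Span` class, the `extract_phrases` helper, the label names, and the sample sentence are assumptions made for this example, not a format used by any provider on this list.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Span:
    """A labeled phrase, stored as character offsets into the text."""
    start: int  # index of the first character of the phrase
    end: int    # index one past the last character (exclusive)
    label: str  # hypothetical label such as "ACTION" or "OBJECT"


def extract_phrases(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Return (phrase, label) pairs for every annotated span."""
    return [(text[s.start:s.end], s.label) for s in spans]


# A command a service robot might receive, split by a human annotator.
text = "Bring the red box to the kitchen"
spans = [
    Span(0, 5, "ACTION"),
    Span(6, 17, "OBJECT"),
    Span(18, 32, "DESTINATION"),
]

for phrase, label in extract_phrases(text, spans):
    print(f"{label}: {phrase}")
```

Storing offsets rather than copies of the phrases keeps the original sentence intact, so several annotators can label the same text without altering it.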
1. Amazon Mechanical Turk (MTurk)
Not surprisingly, since it’s owned by Amazon, MTurk lets companies tap into a vast distributed workforce around the clock. Firms can use MTurk to hire individual workers to help them complete specific tasks for their machine learning projects. These are typically simple jobs like labeling images or transcribing text. MTurk is an ideal fit for small-scale projects that do not require huge volumes of annotated data but do need tasks completed quickly and cheaply.
- Amazon is one of the largest companies in the world
- Develops and implements new technologies in AI and machine learning
- MTurk allows for a wide talent pool to choose from
- MTurk is a crowdsourcing platform, which makes it hard to screen candidates
- Difficult to implement any kind of quality assurance (QA) processes
- Lack of personalization and customization to your unique needs and requirements
2. Mindy Support
Mindy Support is a recognized partner for several Fortune 500 and GAFAM companies (Google, Apple, Facebook, Amazon, and Microsoft), as well as busy start-ups worldwide. It provides a wide range of data annotation services as well as other business-process outsourcing (BPO) offerings.
Mindy Support is one of the largest BPO service providers in Eastern Europe. What separates it from MTurk is that it manages an entire project from start to finish, with a reliable multi-level QA process in place. With such a sizable workforce, Mindy Support can quickly scale up without compromising quality.
- More than seven years of experience meeting 100% of clients’ quality and accuracy requirements
- One of the largest BPO companies in Eastern Europe, with more than 2,000 employees in six locations
- Strong portfolio of customers, including several Fortune 500 and GAFAM companies
- No office in the U.S.
- Lack of brand awareness in some industries
- Time difference with the U.S.
3. Figure Eight
Figure Eight, acquired last year by Appen, provides quality data annotation services using a distributed network of human annotators. That said, it’s always a good idea to keep all annotators under one roof, since this facilitates better communication and helps everyone stay on the same page.
- High quality
- Experience working in 130 countries
- Expertise in 180 languages
- High costs
- Lack of communication due to many projects
- Tool security risks
4. Hive
Hive offers end-to-end solutions for data annotation, but its use cases suggest that it serves a limited number of industries. Furthermore, it is unclear if the company handles any healthcare or agriculture projects.
- Offers a wide range of services, including app development
- Developed its own full-stack AI platform
- A lot of attention given to product development
- Not enough focus is given to data annotation
- Not enough employees to take on large projects
- If you are not happy with its platform, there are no other options
5. Playment
Playment offers various data annotation services, but it appears to be focused solely on the automotive industry. Still, many large companies trust Playment, which provides a detailed description of the various data annotation projects it can manage. This is somewhat out of the ordinary, since few companies elaborate on the types of data annotation they specialize in.
- Developed its own data annotation platform
- Supports a wide variety of annotations
- Focus on autonomous driving
- No mention of industries besides automotive
- No flexibility in terms of tools and platforms
- It is unclear how many in-house employees it has to take on projects
6. Edgecase
Edgecase is one of the few companies on this list that focuses on sectors other than the automotive industry. The platform also has ties to university and industry experts, which boosts its credibility and helps it stand out from the crowd.
- Automated and fast annotation
- Four offices around the world
- More than 3 million images generated every day
- Since Edgecase was founded only a couple of years ago, it does not yet have a lot of experience
- Low quality due to high speed
- Uses only its own tools and limited staff
7. Scale
Scale is an interesting company because it provides a managed labeling service via an application programming interface (API). Many other firms are focused more on the human element, but Scale relies more on computers annotating the data. What’s more, it has a quality-control system, which is something to bear in mind if you’re looking to hire human data annotators.
- Technology does all of the data annotation
- Trusted by some of the largest names in the world
- Performs a wide range of annotation services
- If you are looking for human annotators, Scale does not offer this service
- Limited number of industries served
- Cost savings for customers are unclear
8. Humans in the Loop
This company was founded three years ago and specializes in data labeling services. Humans in the Loop also does a lot of work in the community and provides job opportunities to conflict-affected people in countries like Iraq, Turkey, and Syria. It’s always good to work with a company that is all about people and helping others.
- Provides work for 250 conflict-affected people
- Recognized as a Global Innovator by Expo 2020 Dubai’s Innovation Impact Grant
- Member of the EC Digital skills and jobs coalition
- Annotators work in countries with instability and security risks
- Company has only about three years of experience on the market
- 250 employees is really not enough to take on large projects
9. Clickworker
Clickworker prides itself on micro-task expertise. With this in mind, it’s unclear how it would handle a larger data annotation project, such as one required to train ML algorithms for autonomous vehicles or AI-related healthcare applications.
- Crowdsourced model of operation
- Access to a large talent pool of candidates
- Can assemble a team quickly
- Does not mention the industries it serves or specializes in
- Crowdsourced workers can be troublesome, especially with sensitive information
- With clickworkers, you usually have to redo a lot of tasks, causing delays
10. Dbrain
Dbrain is a crowdsourcing platform that connects data scientists to annotated datasets. Of course, sometimes projects do not require a high skill set, and a person with lesser qualifications can perform the task. In any case, Dbrain is ideal if you need highly skilled and knowledgeable contractors annotating your data.
- Connects data scientists directly with data annotators
- Works with large companies in their respective industries
- Can expedite business processes
- No English website; must use Google Translate or a similar app
- No mention of how many workers it has
- Does not describe its methodology or processes
If you are looking to start a data annotation project, define the most important qualities you are looking for in a partner. Outsourcing can allow developers to focus more on core operations, but you should contact several providers directly to choose the best one.
About the author
Maryna Ozhohanych is senior marketing manager at Mindy Support and has over a decade of marketing experience. She joined Mindy Support to increase the efficiency of deliverables in data annotation for machine learning and AI. Ozhohanych is dedicated to bringing value to automotive, robotics, autonomous security, and other companies by ensuring the right type of data classification, with proper targeting and labeling, using text, video, and image annotation for computer vision and machine learning.