The job of a freelance transcriptionist is to convert audio or video into text at the request of a client. This service usually requires time, concentration, good spelling, a good ear and a lot of patience. Therefore, pricing audio-to-text transcription services correctly is very important if you want to be successful in this field of freelance work.

Learning how to set your rates for transcribing audio to text is essential in this line of work, since it is not as simple as it may seem at first.

First of all:

This is a job that is charged per minute of transcribed audio, and the price of each minute depends on several factors. There are also methods and tools that make the work easier and therefore more profitable.

In this article we will explain how to set your rates and how much to charge for transcribing audio to text per minute. We will also give you some advice on the tools and methods that can help you in this task.

The types of audio or video that clients generally seek to transcribe include recordings of classes, conferences, interviews, testimonials and videos to which they wish to add subtitles, among others.

Why does someone hire transcription services if there is software that transcribes?

One of the reasons why customers hire an audio-to-text transcription service is that artificial intelligence and transcription software are not always accurate, neither with punctuation nor with words. Simply put, when it comes to verbal communication, artificial intelligence is far from achieving what the human brain can do.

The options are usually even more limited and imprecise if the service is provided in Spanish or in another language other than English.

Software is still unable to filter out fillers, errors and repetitions in the audio to be transcribed. It is also unable to summarize when a word-for-word transcription is not desired. Furthermore, if the speaker talks too quickly or does not have good diction, the result will be a text that does not make much sense.

This is why clients require a transcription service, especially when it comes to important jobs. And that is the role of the freelance transcriptionist: to deliver well-done, logical, well-punctuated, quality work.

Aspects to evaluate when establishing your transcription rates

Beyond charging per transcription minute, you need to take some aspects into account, for example:

  • Transcription type
  • Audio quality
  • Speaker’s speech rate
  • Number of interlocutors
  • Accent or region
  • Topic addressed
  • Number of audios to transcribe
  • Help Tools

You can set different rates taking these factors into account. This means that you can start with a basic rate, establishing the standard quality conditions that the audio must meet. Then, you can specify which factors will increase the price, or under what conditions your basic rate no longer applies.

What types of transcription exist?

This refers to how closely the transcription follows the audio, depending on the purpose for which it is to be used. Several types can be distinguished.

Literal (verbatim) transcription:

It is the type of transcription in which everything you hear is converted to text without summarizing or omitting any words. This usually also includes interrupted words, repeated fillers, hesitations, brief interruptions from other interlocutors and background comments, among others. This is the type of transcription requested for testimonial evidence, trials and other requirements of this kind.

Verbatim transcription is usually more expensive because it takes longer. Additionally, sometimes you need to add context indicators, for example, to indicate who a voice belongs to or that a word is not intelligible.

Natural transcription:

This type is more frequently requested. It renders what was said naturally, but omits irrelevant information such as cut-off words or interlocutors who interrupt and whose contribution is not part of the message.

Edited transcript:

It is one where the text is edited and formalized to improve readability and facilitate understanding. Grammatical and verbal errors are corrected, incomprehensible jargon is edited, and parts that do not contribute to the central message are omitted. Repeated words, fillers and broken words, among others, are also usually omitted. This type of transcription is highly requested when it comes to forums or class recordings. It can also be useful for media publications of interviews conducted verbally.

In this type of transcription, pauses, noises in general and any off-topic words or sounds are removed. It requires judgment, attention and the additional work of proofreading and editing the text.

There are other types of transcription, such as phonetic transcription, but this is a much more advanced area in which knowledge of linguistics and the International Phonetic Alphabet (IPA) is needed. In addition, it can be linked to other areas such as translation. Therefore, we will not delve into it on this occasion.

Audio quality

This refers to the clarity with which the speaker’s message is heard. Obviously, if the recording is made up close, and the person speaks clearly, without interruptions and at medium speed, it will be much easier to transcribe what the audio says.

On the other hand, if there is background noise or frequent interruptions from other interlocutors, or if the recording is made far from the speaker or the source of the sound, it will be more difficult to hear clearly and therefore more difficult to transcribe.

Speaker’s speech rate

This is another factor to observe, since the speed at which each person speaks is different. It is advisable to take a few sections of the audio at random and review how many words the speaker says within a given period.

Transcribing 120 or 125 words per minute of audio is not the same as transcribing 180 or 190.

This is the usual range at which people speak. The difference hardly matters in conversation, but it makes a big difference when transcribing what was said.

Depending on the words per minute estimate, the rate may be higher. You can take a sample minute from the middle of the recording, at random, and try transcribing it. If you don’t have the opportunity to do this, ask the client if the speaker in the audio speaks quickly or slowly.

The client generally has no problem answering these questions. If they tell you that the speaker talks pretty fast, be prepared because the job will require more effort. You will have to listen more carefully and it will take longer to transcribe each minute.
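The sampling idea described above can be turned into a quick calculation. The following is only a minimal sketch: the function name and the sample texts are hypothetical, and it simply counts whitespace-separated words in short passages you have already transcribed.

```python
def words_per_minute(sample_text: str, sample_seconds: float) -> float:
    """Estimate speech rate from a transcribed sample of known duration."""
    if sample_seconds <= 0:
        raise ValueError("sample_seconds must be positive")
    word_count = len(sample_text.split())
    return word_count * 60 / sample_seconds


def average_rate(samples: list[tuple[str, float]]) -> float:
    """Average the estimate over several (text, seconds) samples taken at random."""
    if not samples:
        raise ValueError("at least one sample is required")
    rates = [words_per_minute(text, seconds) for text, seconds in samples]
    return sum(rates) / len(rates)


# Hypothetical samples: two short passages transcribed from different
# points of the recording, with the audio time each one covers.
samples = [
    ("good morning everyone and welcome to today's class", 4.0),
    ("we will start by reviewing last week's homework", 4.0),
]
print(f"Estimated speech rate: {average_rate(samples):.0f} words per minute")
```

Taking several samples from different parts of the recording gives a steadier estimate than a single minute, since speakers often speed up or slow down.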

Number of interlocutors

This aspect must be taken into account when it comes to classes or interviews, since in a class there is usually interaction between interlocutors. This is very different from a presentation or instructional video or, for example, a video for a YouTube channel, where there is only one speaker.

On the other hand, in a class there are usually more interlocutors than in an interview, but their interventions may occur less frequently than in the latter.

In any case, when there is a change of speaker, you will almost certainly have to indicate it within the text, whether or not the name of the person is specified.

Therefore, when there is more than one interlocutor, the rate can rise by 5 to 10 cents per minute of audio.

Accent or region

Accent or region may be important depending on geographic distance, as well as the speaker’s clarity or neutrality in speaking.

For example, if you live in Mexico and are transcribing an audio in Spanish from Spain, you may need to pay more attention to the words. Sometimes you will have to listen twice or slow the audio down to understand it properly.

You may also have to look for references on the internet for terms that you don’t know how to spell.

On the other hand, accents within the same country tend to vary by geographical area, which can make the audio difficult to understand if the speaker comes from a rural area or an area full of localisms.

This is, therefore, another factor that can cause the rate per minute to increase.

Topic addressed and terminology

In addition to the above, transcription can be difficult if the topic discussed in the audio is of a higher technical or academic level. It can also be harder if the audio contains elements of another language such as Anglicisms, a lot of slang, professional jargon or frequent mentions of acronyms, among others.

When it is a simple topic, you can keep your basic rate. But if it is a more complex subject, you will have to increase the price, since you will have to pay more attention to the work and it will require more effort on your part.

Amount of audio to transcribe

When the audio is very long, the client will probably want you to offer some consideration on the price for the volume of work, and you certainly can. But it is better to apply a fixed discount to the final price than to reduce the per-minute rate based on the length of the audio.

For example, if you have set a rate of $1.75 per minute and the audio is 6 hours (360 minutes), this will give you a total value of $630. It is better to tell the client that you will do all the work for $600 than, for example, to reduce your rate to $1.50 per minute, which would net you $540, quite a bit less money.

The client will appreciate the consideration in both cases, but the discount costs you less in the first one. And ultimately, that is what the discount should be: a consideration for the volume of work, not a reduction of your fees or of what the work is really worth.
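The comparison above can be checked with a few lines of Python. This is only an illustrative sketch using the figures from the example; the function name is hypothetical, and `Decimal` is used so that the amounts are not affected by floating-point rounding.

```python
from decimal import Decimal


def compare_discounts(rate, minutes, flat_price, reduced_rate):
    """Compare a flat final price against a lower per-minute rate."""
    full_price = rate * minutes
    reduced_price = reduced_rate * minutes
    return {
        "full_price": full_price,
        "flat_discount": full_price - flat_price,
        "reduced_rate_discount": full_price - reduced_price,
    }


# Figures from the example: $1.75 per minute, 360 minutes,
# a flat offer of $600 versus a reduced rate of $1.50 per minute.
result = compare_discounts(Decimal("1.75"), 360, Decimal("600"), Decimal("1.50"))
for label, amount in result.items():
    print(f"{label}: ${amount:.2f}")
```

With these figures the flat offer gives up $30, while the reduced rate gives up $90 on the same job.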

Help Tools

This refers to the resources you will need to transcribe the audio. Many people make the mistake of thinking that it is only about “listening and typing what you hear”, as easy as pie…

But in reality, sometimes clients deliver audio in formats for which you do not have a player, and you will have to convert the files before starting to work. If the audio is very long, the conversion tool is probably not free, or you will have to divide the audio into sections, which means extra work.

On the other hand, if the deadline is very close but the audio is quite clear, you may want to use an automatic transcriber and then correct the generated text. These tools are rarely free and can cost anywhere from $0.10 to $1.00 or more per minute. Depending on the tool you need to use, you will have an extra cost. Keep this in mind when setting the per-minute rate for your client.

For example, if your established rate is $2.00 but you will need a tool of this type and the value is $0.10 per minute, you must change your rate by adding this amount per minute. Never skip this cost.
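As a quick sketch of that adjustment, using the figures from the example (the function name is hypothetical):

```python
from decimal import Decimal


def rate_with_tool_cost(base_rate, tool_cost_per_minute):
    """Add the per-minute cost of a paid tool to the per-minute rate."""
    return base_rate + tool_cost_per_minute


# Figures from the example: a $2.00 rate and a tool that costs $0.10 per minute.
adjusted = rate_with_tool_cost(Decimal("2.00"), Decimal("0.10"))
print(f"Adjusted rate: ${adjusted:.2f} per minute")
```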

How much to charge per minute of transcription?

Generally, a text transcription can start at around $1.75 per minute when dealing with a single interlocutor in your regional language and with natural transcription. From there you can add additional fractions for each difficulty factor or element. The value for a high difficulty transcript can even be around $3.50 or more.

You can start by establishing a basic rate and then detail, according to the factors seen previously, how much will be added per minute for each extra difficulty.


Basic text:

  • 1 speaker
  • 125 to 130 words per minute
  • Spanish (regional)
  • Clear audio
  • Non-technical topic
  • Natural transcription

$1.75 per minute

+ 1 speaker or interlocutor: $0.05 extra per minute

+ Spanish outside of the X region: $0.05 extra per minute

+ Verbatim transcription: $0.10 extra per minute

Final rate: $1.95 per minute

After detailing this breakdown, multiply the rate by the total minutes to be transcribed. If you wish, you can then subtract the discount amount.

Example: $1.95 x 360 minutes (6 hours) = $702
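The breakdown above can also be kept as a small script so that each quote is calculated the same way. This is a minimal sketch, not a definitive price list: the dictionary keys and function names are hypothetical, and the amounts are the ones from the example.

```python
from decimal import Decimal

BASE_RATE = Decimal("1.75")  # basic rate per minute of audio

# Extra amount per minute for each difficulty factor (figures from the example)
SURCHARGES = {
    "extra_speaker": Decimal("0.05"),
    "other_region": Decimal("0.05"),
    "verbatim": Decimal("0.10"),
}


def rate_per_minute(factors):
    """Basic rate plus the surcharge of every factor that applies."""
    return BASE_RATE + sum((SURCHARGES[f] for f in factors), Decimal("0"))


def quote(minutes, factors, discount=Decimal("0")):
    """Total price: per-minute rate times minutes, minus an optional flat discount."""
    return rate_per_minute(factors) * minutes - discount


factors = ["extra_speaker", "other_region", "verbatim"]
print(f"Final rate: ${rate_per_minute(factors):.2f} per minute")
print(f"Total for 360 minutes: ${quote(360, factors):.2f}")
```

Keeping the surcharges in one place makes it easier to review the rates regularly without recalculating every quote by hand.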

Before setting the delivery time, go back to the words-per-minute test you did and see how long it takes you to transcribe a single minute of audio. Keep in mind that sometimes you will have to go back more than once to listen to the audio again.

Also take into account the hours you will have available to do the work. Having the day off to do it is not the same as, for example, doing it at night after your regular daily work or classes.
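A rough way to turn these two measurements into a delivery estimate is sketched below. The function name and the sample figures (4 minutes of work per minute of audio, 6 working hours per day) are assumptions for illustration only; use the values from your own test.

```python
import math


def estimate_delivery_days(audio_minutes, work_minutes_per_audio_minute, hours_per_day):
    """Return (total working hours, whole working days needed)."""
    if hours_per_day <= 0:
        raise ValueError("hours_per_day must be positive")
    total_hours = audio_minutes * work_minutes_per_audio_minute / 60
    return total_hours, math.ceil(total_hours / hours_per_day)


# Hypothetical figures: 360 minutes of audio, 4 minutes of work per
# minute of audio, and 6 hours available per day.
hours, days = estimate_delivery_days(360, 4, 6)
print(f"About {hours:.0f} working hours, or {days} working days")
```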

Also take the delivery time into account so that you can charge the client an urgency surcharge when needed, as is normal in any freelance job.

Make any adjustments you consider necessary to your rates, and as we always recommend, review them regularly.

How can I work as a transcriptionist?

Now that you know what factors to take into account when stipulating the price of your per-minute rates, let’s look at some final tips that will help you make the transcription job easier.

Find a quiet, calm place

Although it may seem obvious, the first point to keep in mind is to work in a quiet place, where you run little risk of being interrupted. Minimize distractions such as pets, background music, noise from children playing and sounds coming from the street, among others.

Comfortable seat and correct posture

Try to find a place where you have enough natural light and you can sit in a comfortable position. After a couple of hours of sitting transcribing you will appreciate having an ergonomic chair and a stable position.

Good quality headphones

This is a key technical factor for every freelancer who works in transcription. Use headphones that work well and through which the message can be heard clearly. The cable and plug of the headphones must be in excellent condition so as not to add noise to what you hear.

There is no need to invest in expensive equipment. Mid-range headphones will work well if you take care of them and work in a quiet place, as mentioned above.

Function to quickly go back or forward a few seconds

Choose an audio player that has the function of easily moving forward or backward a few seconds. This will be very useful when you need to listen to something again that you couldn’t hear well, or if you need to get up for a moment and then pick up where you left off. This is much easier than dragging by hand and much more precise.

The Windows 11 music player has this feature, but there are sure to be many other options on the market.

Reduce speed without reducing pitch

This function works best when the speaker speaks clearly and is close to the recording source.

Windows AutoPlayer, VLC Player, and others have this feature.

It can help you if you already have a paragraph written but want to listen to it again more slowly, either because you are not sure about something or during the final review.

Two windows on the same screen

This function will be particularly useful if you need to search the internet for technical terms. You can have your text document on the left and your search engine on the right to find the information you need. Although it may seem insignificant, you will save a lot of time by not having to minimize and switch windows. This is a way to get the job done faster and with less effort.

Pause and play functions of your keyboard

Just like the double window function, having a keyboard from which you can pause or play without having to use the trackpad or mouse will save you a lot of time when moving forward with your work.

If you do not have these functions on your keyboard, a tool that can help you is oTranscribe, where you can upload your audio file and use the Escape key to pause or play while you type in the text editor that the tool itself provides. Then you can export the text in plain text format (txt) or copy and paste it into your word processor.

What automatic transcription programs or applications are there?

When the audio is too long and you have a short delivery time, but the audio is clear, you can use a transcription tool so that you don’t start the work from scratch, but rather have a base from which to start. There are some solutions on the market for this type of need.

They are not 100% precise, so you will still have to listen to the audio from the beginning and correct wrong words or punctuation, but you will appreciate the time this previous step saves when the audio is long or you have many files to transcribe.

Some of these solutions are tools like Amberscript, Happy Scribe, and Gglot. The way they work is that you upload the audio to the site, hit Convert, wait a few minutes, and download the result as a text document.

Please note that these tools are paid. Also, remember that this type of tool is not usually very precise, especially in Spanish. Therefore, a thorough review will always be necessary. The advantage is that you can do it faster than if you start from scratch.

Word Spelling and Grammar Check

Once you have the text transcribed, you can use the spelling and grammar check in Word, or in whatever word processor you use, to catch any inadvertent errors.

Also, if it is an edited transcript, you can use its suggestions to improve the grammar after you have personally reviewed the text.

Very important: The automatic review process of your word processor does not exempt you from a manual review, which you must do if you want to deliver good work.

Where do I find work as an audio-to-text transcriptionist?

As you have seen, the work of a freelance transcriptionist requires concentration, quality and knowledge of the tools used. It is also a job that takes time and in which you have to take into account technicalities that at first glance seem irrelevant but end up complicating the process. That is why setting your rates as a professional transcriptionist is something you must do very carefully to make the work and effort worthwhile.

