Typingpool: Human Audio Transcription Parallelism
theodp writes "Silly rabbit, parallel processing is not just for Big Data! Building on techniques outlined by Andy Baio back in 2008, Wired writer and 20% Doctrine evangelist Ryan Tate has released Ruby-based software called Typingpool to make audio transcriptions easier and cheaper. 'Typingpool chops your audio into small bits and routes them to the labor marketplace Mechanical Turk,' Tate explains to his reporter pals, 'where workers transcribe the bits in parallel. This produces transcripts much faster than any lone transcriber for as little as one-eighth what you pay a transcription service. Better still, workers keep 91 percent of the money you spend.' Remember to Use the Force for Good, Tate adds."
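For anyone wondering what "chops your audio into small bits" and "transcribe the bits in parallel" amounts to, here is a minimal Python sketch of the general idea. It is not Typingpool's actual code (which is Ruby); `chunk_bounds`, `transcribe_chunk`, and `transcribe_in_parallel` are hypothetical names, the 60-second chunk length is an assumption, and the "worker" is a placeholder function rather than a Mechanical Turk call.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(total_seconds, chunk_seconds):
    """Return (start, end) pairs, in order, that cover the whole recording."""
    return [
        (start, min(start + chunk_seconds, total_seconds))
        for start in range(0, total_seconds, chunk_seconds)
    ]


def transcribe_chunk(bounds):
    """Hypothetical stand-in for one human worker transcribing one chunk."""
    start, end = bounds
    return f"[transcript of {start}s-{end}s]"


def transcribe_in_parallel(total_seconds, chunk_seconds=60, workers=8):
    """Hand every chunk to the pool and reassemble the pieces in order."""
    bounds = chunk_bounds(total_seconds, chunk_seconds)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map() returns results in input order, so the transcript
        # comes back in sequence even if chunks finish out of order.
        return list(pool.map(transcribe_chunk, bounds))
```

Calling `transcribe_in_parallel(150, chunk_seconds=60)` splits a 150-second recording into three chunks, with the last one shorter than the rest. The speedup comes from the chunks being independent of each other, which is also why very short chunks can cost the transcriber useful context.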
Excuse Me While... I Kiss This Guy. (Score:5, Insightful)
Training (Score:5, Interesting)
The biggest market for audio transcription I'm aware of is in medical transcription, followed by legal. Many of the terms used are not normal English; in fact, without a basic understanding of medicine, you could easily confuse one thing for another, with potentially tragic results. This is fine for everyday English, but not in industries whose terminology isn't everyday English. And that's a lot of it. This would be more useful for something like Siri -- no doubt Apple has humans to process some of the unknowns in the background, and would find a service like this useful, if they don't have something similar already.
Re: (Score:2)
This would be more useful for something like Siri -- no doubt Apple has humans to process some of the unknowns in the background
Not so sure about that [youtube.com].
Re: (Score:2)
To be fair it took me several times before I could figure out what he was saying. Radiolingua really needs to create a "Coffee Break Scottish." This might even need to be a prerequisite to their other instructional series.
Re: (Score:2)
Really? Interesting. It was apparent on the very first attempt, when his accent was at its broadest.
He needs to get himself out of Scotland though. An accent like that'll win him popularity across the world.
Re:Training (Score:4, Informative)
I suspect distributing even small, redacted portions of a medical or legal dictation would violate the many confidentiality laws in force in these industries.
I'm a sound editor and from time to time I've toyed with sending certain extremely cretinous jobs to Mechanical Turk, things like cutting silence out of audio recordings (can't always automate this), identifying and synchronizing numerous takes, or going through a scene frame by frame and identifying every frame with a gunshot. It's technically possible but if your project is anything more complicated than the tiniest FunnyOrDie video you're going to be breaching the producer's confidence.
As information technology makes things like Mechanical Turk easier to implement, it makes the information you would send to MT all the more valuable and dangerous to release.
Re: (Score:2)
That may be the biggest market right now, but that doesn't mean there aren't significant areas where this may be useful. Independent podcasts would be a great one -- currently most of those don't have any transcriptions available, but there are a number that I personally would follow if I could read them instead of listening. Conferences are another great one. Generally you'll get partial transcriptions of keynote speeches and that's about all they can do. A service like this might make full transcriptions
Comment removed (Score:5, Interesting)
Re: (Score:2)
I don't see how you made $10/hour. The assignments that I got were only a few cents. You say $20 would pay for your expenses for the day? I thought that room and board would be more than that.
When I did it, I made about $1/hour, and sometimes less, because the voices were garbled, and sometimes my work would be rejected. It was silly.
Re: (Score:1)
I forgot to add that I don't like what the summary said. 1/8 of the wages is not much at all. If it was $8 before, then the new price is $1. Being able to keep 90% of that is $.90. I'd rather have 50% of $8.
Maybe I misunderstand the summary.
Amazon Charges 10% Fee, Worker Gets 90%+ (Score:2)
How are the fees calculated? [mturk.com]: "Amazon Mechanical Turk collects a 10% commission on top of the reward amount you set for Workers. For example, if a HIT reward is set to $0.20, Amazon Mechanical Turk collects $0.02 for each assignment." So, the worker gets 91% of the total amount paid ($0.20/($0.20+$0.02)).
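The same arithmetic as a small Python sketch, for anyone who wants to plug in other reward amounts. The function name `mturk_split` is made up for illustration, and the 10% rate is simply the figure from the quoted FAQ text, taken as an assumption here.

```python
def mturk_split(reward, commission_rate=0.10):
    """Split a requester's payment for one assignment.

    reward: the amount the worker receives, in dollars.
    commission_rate: the fee charged on top of the reward (assumed 10%).
    Returns (commission, total_paid, worker_share_of_total).
    """
    commission = reward * commission_rate
    total_paid = reward + commission
    worker_share = reward / total_paid
    return commission, total_paid, worker_share


commission, total_paid, worker_share = mturk_split(0.20)
print(f"commission ${commission:.2f}, total ${total_paid:.2f}, "
      f"worker share {worker_share:.0%}")
```

Because the commission is charged on top of the reward, the worker's share works out to 1/1.1 of the total, or roughly 91%, whatever the reward is.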
Re: (Score:1)
That part I understand. It's the lower price that bugs me. From the summary, I get the impression that they had to pay $10, for example, for an ordinary contract, but now, they can pay $1. In other words, it sounds like a worse deal for the labour force. The percentages might be better, but the total money at the end of the day is worse, if I understand correctly.
Re: (Score:2)
I'm pretty sure that whoever creates the HITs assigns their value. Market forces essentially dictate the price.
If somebody wants to pay you $20/hr they can, or if they want to pay $2/hr they can do that instead. If they offer too little then nobody will work on them, in theory. However, it is a global market.
Re: (Score:3)
Re: (Score:1)
No, I haven't been to those countries. Thanks for clarifying. :^D I've been to China, though, so I should have known. I think that part of my problem is that I did most of my travelling in Canada.
Re: (Score:2)
You say $20 would pay for your expenses for the day? I thought that room and board would be more than that.
It depends on where you live; it should be enough in cheap countries, like Cambodia.
Re: (Score:2)
Five or six years ago, I transcribed podcasts for Mechanical Turk. Their audio files were already split into shorter passages (3 or 5 minutes, for example). Split them even further, and the transcriber might miss out on the context, which is often vital to knowing what exactly is being jabbered about.
That said, I'm not sure who is the target demographic for this kind of work anymore. Many of the podcasts were on subjects of interest to me, and I was getting about $10/hour from Mechanical Turk, which wasn't bad considering that I was often doing the work from backpacker beach havens around the globe where a couple of hours of work would pay all my expenses for the day. But the last time I had a look at Mechanical Turk, the amount they pay had been heavily reduced. Who wants it now? Even if you are in a cheap third world country, if you have the English skills for such transcriptions, you can surely find better and more dependable work elsewhere.
Where can I find these backpacker beach havens? Seriously!
Re: (Score:2)
Pretty much anywhere, really.
Stay clear of major tourist centers, mix with locals extensively, and don't hesitate to ask if the latter have a room available for a few nights. Nowadays it's also easy to find a cheap bed through couch-surfing websites, short-term flatsharing websites, and Facebook.
Re: (Score:2)
No, you mean Eduardo Saverin, Traitor to the US. (Score:2)
He thought that he was too good for US citizenship.
Perhaps the US should pursue that additional $62 million by returning him and said assets to US jurisdiction. In this case, the military and intelligence used to effect his return would be a force used for good.
That's how anime fansubs are done (Score:5, Interesting)
The volunteer process by which Japanese anime is subtitled within hours of release works a lot like this.
Re: (Score:1)
And it's how we end up with mass naked child events.
Re: (Score:2)
More details please? Would love to see translations for some of my other favorite Japanese shows
It's different for different groups. Many of the groups have or have had forums where they announce releases and talk about whatever. A bunch of people capable of transcribing each take a section of the episode and typically do a transliteration, then often one person produces a translation. One or more people then time the subs, which doesn't really take all that long. Then they encode the video and upload it to their distribution point, and you know how it works from there, I assume.
Subbing u
Now we have one less job category (Score:2)
And a new group of people expected to go on unemployment. Good job.
Sounds like the Human Computers of WWII (Score:2)
Human Computers [nasa.gov] describes the (mostly) women who did the mathematical calculations the engineers handed them to do for the war effort, freeing up valuable engineering time.
The article doesn't say how much parallelism there was in these "human computer pools" but I suspect there was a lot of it.
I guess it goes to show that even today, some problems are, for the moment at least, done cheaper, faster, and/or better (pick any two) with a person than an electronic computer.
See also: Logopolis [wikipedia.org], a Dr. Who serial f
It already starts in bad territory (Score:2)
Like any tool, Typingpool could probably be used that way. Please don’t use it that way!
You're using a service that generally does not attract people from the developed (First World) nations. It *is* being Used That Way.
Typingpool defaults to paying $0.75 a minute, and I often offer $1.00/minute, which produces transcripts very quickly, tends to attract better workers and is still roughly half the best rate I’ve seen for high-speed professional transcription. I have had success completing transcripts at lower rates, and Baio four years ago was able to find plenty of workers at $0.40/minute. But lower prices generally translate to slower transcription and lower quality.
You get what you pay for, but the service would have to have a built-in "No Third World" option defaulted to "on" for it to work well.
Also, bear in mind that just because you pay Mechanical Turk workers a lower rate through Typingpool than you’d pay a service doesn’t mean the workers are actually getting paid less or are worse off
However, the likelihood is that they are getting paid less and are worse off.