Monday, April 30, 2012

Germany hates YouTube and wants it to die

So here is an interesting story about a court decision in Germany that dictates that YouTube should "filter" all uploaded content to ensure that it does not infringe copyright, or pay royalties.

Yet another thing that sounds simple and sensible until you actually think about it for a minute.

The key word here is "filter".  Sounds easy enough...think Brita water filter, only for copyrighted content of all sorts.  All you need to do is have a programmer wave a magic wand and voila, filter created.

Unfortunately, programming something like that is not as simple as it sounds.  People who don't know how to program have an unfortunate tendency to believe that it is easy to do.  After all, it is easy for them to use a computer with a mouse and keyboard and web browser and surf the net.  What they don't realize is that it is so easy for them to do that because of literally decades' worth of careful programming work.

For example, if I were to ask you how you would go about writing a program to filter copyrighted content, what would you say?  Come on, what's the problem?  Just make a filter!

Well, I've got news for you--it is an excruciatingly hard, if not utterly impossible, thing to do.

Look, what you need to understand is that computers are machines, and so are computer programs.  You have to start with facts which can be understood, and then write logic that actually does stuff with those facts which is comprehensible by the mind of man.  Let me show you how this works.

Mr. YouTube himself knocks on your office door and tells you that you have been tasked with writing the program which will filter all copyrighted content off YouTube, and it would be nice if you could have it done by COB.  So you need facts.  Well, logically, the facts you need will include:

1. A knowledge of all copyrighted media in the world--all movies and music in particular.  In total--you need to know about every fraction of every second of everything, in order that you can recognize a clip.

Think about it--someone uploads a clip.  You need to do a comparison.  What are you going to compare it against?  Obviously, everything that has ever been created in movies, TV shows, and music.

Some sub-problems of this are:

A. New content that didn't exist when you got started on the project.  You're going to need to update your database of all copyrighted media in every language from all over the world every day.

B. You need to do this using psychic powers, since not everything that is copyrighted is registered.  In the United States, at least, anything anyone makes (almost) is copyrighted by default, and you will need to divine the intent of the author on a regular basis.

C. A typical DVD is about 4 gigabytes of data (compressed).  A random Google search tells me there have been 25,000 movies ever made, which I think is way low but let's run with it.  25,000 x 4 gigs is 100,000 gigabytes, or 100 terabytes of data.  You might be surprised to learn that you cannot buy a 100 terabyte hard drive at MicroCenter, and that searching 100 terabytes of storage would take a little while.  As in days.

And that's just movies.
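
For anyone who wants to check the arithmetic, here is a back-of-the-envelope sketch in Python.  The movie count and the DVD size are the rough figures from this post; the read speed is purely my assumption (roughly one ordinary hard disk reading sequentially), so treat the result as an order of magnitude and nothing more.

```python
# Back-of-the-envelope storage estimate using the rough figures from the post.
MOVIES = 25_000          # rough movie count quoted in the post
GB_PER_MOVIE = 4         # typical compressed DVD size quoted in the post
SCAN_MB_PER_SEC = 100    # ASSUMPTION: sequential read speed of a single disk

total_gb = MOVIES * GB_PER_MOVIE                    # total size in gigabytes
total_tb = total_gb / 1_000                         # same size in terabytes
scan_seconds = total_gb * 1_000 / SCAN_MB_PER_SEC   # time for one full read
scan_days = scan_seconds / 86_400                   # 86,400 seconds per day

print(f"{total_gb:,} GB = {total_tb:,.0f} TB")
print(f"One full pass at {SCAN_MB_PER_SEC} MB/s: about {scan_days:.1f} days")
```

Under those assumptions a single pass over the movie library alone takes well over a week, and that is one pass for one comparison.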

Next fact:

2. You will need to develop brand new computer vision and hearing technology so that you can make essentially "subjective" judgements about whether an upload matches one of the members of your huge database of all copyrighted media.  And oh, by the way, I'm sure the Department of Defense would appreciate it if you could share this groundbreaking technology with them to use in their killer robots, because that's the level of sophistication required here.  This technology does not actually exist (quite).

Bear in mind that simple checksums will not work even remotely reliably, as a simple re-encode will alter the checksum.
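
Here is a minimal sketch of why, using Python's standard hashlib.  The "video" is just made-up bytes and the "re-encode" is a single flipped bit, both stand-ins I invented for illustration.  A real re-encode changes nearly every byte in the file, so this actually understates the problem.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Return the SHA-256 checksum of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical stand-in for an uploaded video file: 1,024 made-up bytes.
original = bytes(range(256)) * 4

# Hypothetical stand-in for a re-encode: identical except for one flipped bit.
altered = bytearray(original)
altered[500] ^= 0x01
reencoded = bytes(altered)

# Count how many byte positions still match between the two versions.
same_bytes = sum(a == b for a, b in zip(original, reencoded))

print(f"{same_bytes} of {len(original)} bytes identical")
print("checksums match:", checksum(original) == checksum(reencoded))
```

The two files are all but identical, yet the checksums have nothing in common, so an exact-match lookup would wave the copy right through.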

3. You've got to do it fast.  According to the linked article, users upload 60 hours of video content to YouTube every minute.  Even if you had a billion human beings with encyclopedic knowledge of copyrighted material in their heads, they could not filter all the content.  Add that to the enormous database above and....

So.

Good luck with that.

I hope you can see now why this is a truly idiotic notion, that you can just "write a filter" to "filter out copyrighted material".  It requires knowledge and technique which does not exist--but other than that it is perfectly feasible.

So.  The alternative under these new rules imposed by a judge is either to close up shop, or for YouTube to pay royalties for everything their users upload.

You heard that, right?  YouTube (owned by Google) isn't schlonging all this copyrighted material onto YouTube for the enjoyment of the masses--it is uploaded by their millions of users, which YouTube are supposed to somehow control.

Is that right?  Should YouTube be liable for the actions of their users?  Think about it: 60 hours of video a minute.  That is what YouTube and other video sharing sites have given us--a phenomenally powerful and complete library of human culture.

I can see why someone would want to destroy that.

Anyway.  The one good thing about this is it is Google they are going after, now--finally a non-defenseless victim.  That being said, the reason the suit is happening is doubtless because it is Google, because they have money, as opposed to suing all the small fry who actually do the uploading.

And to be further fair, it has to be said that YouTube definitely has ads on their site, and is thus quite commercial.  This is why Google/YouTube have sought cross marketing agreements with content owners where they split the ad revenue on videos with their content.  Not an unreasonable response, in my opinion.  But not good enough for GEMA (the German RIAA).  They want a cash payout or an impossible task.  Because, as usual, suing people is better than doing honest work.

I'm not sure a YouTube without a profit motive could exist, mind you.  But that's not my problem, to make Google money.  I'm just searching for what's right, and what is best for humanity's information network.


