mailto: blog -at- heyrick -dot- eu


AI and copyright

We live in a world where copyright is a thing and it is often aggressively pursued. If I were to make a video of me narrating this, and you could hear any part of Peter Gabriel's "Four Kinds of Horses" in the background, my video - the primary part being my work and my creative output - could be subject to a takedown request, or be monetised, by whatever company owns the rights to the music. Even if the music just happened to be on in the background, where one might argue fair use applies.
I have practical experience of this - one of my Advent Calendar videos a while back had a song playing on my NetRadio as I was recording it, I actually panned over to show the LCD crediting the song and author, and... well... it's blocked in a bunch of countries. I think Germany is one?

So it is particularly galling to see OpenAI attempting to claim "fair use" for their mass pilfering of copyright materials to train ChatGPT without permission.
Google, having thoroughly abandoned the "don't be evil" slogan, thinks it's perfectly fine for AI to mine anything and everything unless companies specifically opt out.
Thankfully there is some pushback starting on this side of the world.

Now, there is a long discussion about how copyright is fundamentally broken (in large part thanks to a certain rodent), but by and large people who create things freely (and not for pay) own the rights to their creations. If done for pay, it's usually the employer who owns the rights unless contractually stated otherwise.
Anyway, this crap that you are reading now is copyright to me. I credit myself as "Rick Murray" rather than my full legal name, as "Rick" is the name I go by (hence the "rick" in "heyrick", which is literally "Hey, Rick", in case it wasn't obvious). Plus, my middle name is something that my parents gave me that's been an annoying problem ever since despite me never using it.
But this is mostly a formality. With the exception of a certain somewhat corrupt regime <cough>America</cough> there is no need to register for copyright protection. It "just exists" as one of the provisions granted by the Berne Convention.

This means that I have the right to demand a takedown if somebody should copy my crap someplace else. I have the right to tell you how to use it. Which, of course, you are free to ignore should the terms be unfriendly, such as "you can only read this blog at the stroke of midnight while blasting Judas Priest as loudly as your equipment can manage". But, since it's copyright, if you don't follow the rules you don't get the right.
(and, yes, I'm quite aware that loads of people ignore copyright; but just like ignoring speed limits on roads, that's on them)

Now, there's an unwritten contract at play here. I create stuff, partly as my way of communicating with the world, partly in order to educate, partly in order to amuse, mostly to insult Tory scum, and a few other things in between.
If you take some of my ideas or source code for your own personal use, then that's fine by me. That's what it's there for. To help you (and for clarification, by "you" I am referring to an individual sentient being with a beating heart; not a "corporate personhood" or whatever other legal gibberish makes clearly non-people into people). If you improve upon what I do, it would be very nice if you feed your improvements back so that everybody can benefit - it's depressing how many people don't do this - so my MIDI module, extended and made much more useful by Dave and Ian, was a breath of fresh air.
If, however, you take something of mine that I have spent time in creating (this rubbish doesn't write itself) and you find a way to make money from it, then let's talk licensing. And, no, I don't mean that I expect to be paid if you write a composition that uses my MIDI module; the module in that case is just a tool that you're using, like RISC OS or the music software.
But if you directly profit from the use of one of my creations, then yes, it's a moral obligation. Or, to put it more bluntly, why should you be the one getting paid for my work?

Now, don't get me wrong here, about the only thing of mine I'm aware of being ripped off was my old ARM programming tutorials around the turn of the millennium (which popped up in a lot of obscure places in India and China, sometimes with my copyright notice stripped off, sometimes without). I was pleased that people were finding it useful, but annoyed that nobody had the decency to actually ask.

Now enter AI. Artificial Intelligence... sorry, Idiocy. While laypeople have attributed all sorts of mystical powers to AI, it is really only seriously complicated pattern matching. To be sure, things like ChatGPT and Dall-E are impressive feats of coding, but that's only about half of the value of the company called OpenAI behind those AI systems. The other half of the value is in the training data, which in the case of OpenAI is not disclosed. This training data helps to make the AI much more capable, which in turn helps to make it much more valuable - both in the sense of usefulness, and also the sense of little floating currency symbols.
In essence, then, the AI bot scraping the internet is helping AI companies to make money, and they feel that they should have the right (as a fair use exemption) to be able to pillage anything and everything in sight - without even having the decency to ask first.

The reason this is a concern is because there is no actual "intelligence" in AI, despite what the name might imply. It's smoke and mirrors, for AI has no understanding. It's a massive system that builds a scatter chart of interconnected fragments so incredibly massive that I don't imagine anybody is actually capable of rationally auditing an AI model. Hell, I wonder if it's even able to be understood, never mind audited.
It can ingest, in the course of a day, millions of works - from blogs to source codes to books. It can tear them apart: analyse, organise, categorise, and rationalise, in order to plagiarise and fantasise, with nobody to apologise ... unless perhaps somebody should dramatise, demonise, and criticise.
But, most of all, it will mix up and mash up all of this stuff that it has pilfered the web to find, eventually regurgitating some of it. With added hallucinations (perhaps faulty connections in the learning process?) just to make things more interesting.

 

If created works are "protected" (to varying degrees of success) by the concept of copyright, then this should remain intact, with no exemptions for AI. After all, if it should become possible for an AI to take our (not "my", "our") creations to train it, then why can't I take a series of five second snippets of different types of metal in order to "train biological intelligence"?

Aside: Trust me - Death Metal, Power Metal, and Symphonic Metal are only alike in having a chugging guitar bass line. It's the lowest pitched steel string...that's why it's called "Metal", folks! In terms of content and presentation, I would say that Death Metal and Symphonic Metal are otherwise about as closely related as Rock'n'Roll and Swing, that is to say "they're types of music".
Which means if I should mention listening to metal, please don't imagine that I'm a fan of Korn or Slipknot.
Now, if I could claim fair use and make a video outlining snippets of each type, you'd see what I mean. Aaaanywaaay....

 

Now, you might be thinking that it'll be fairly easy to extend a site's robots.txt file to add:

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: TurnitinBot
Disallow: /

However, this is probably slamming the stable door shut after the horse has bolted, grown old, and died. It also gives an unfair advantage to Microsoft and Google, who likely feed the results of their search engine crawlers into their AI systems (meaning that to block Google's AI you'd literally have to block Google, with all of the side effects that this would entail). And it's dead easy to work around: simply make an updated bot and give it a new name that isn't on the disallowed list.
It's also worth pointing out that robots.txt is a polite suggestion, not an actual requirement. A bot can choose to completely ignore it.
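
As a rough illustration of those points, here is a minimal Python sketch that feeds the rules from earlier to the standard library's urllib.robotparser and asks what a well-behaved crawler would conclude. The URL and the bot name "ShinyNewBot" are made up for the example, and the helper may_fetch() is my own naming, not anything the real crawlers are known to use. Nothing here can force a badly-behaved bot to ask in the first place.

```python
from urllib.robotparser import RobotFileParser

# The rules from the robots.txt snippet above.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: TurnitinBot
Disallow: /
"""

# Placeholder URL, purely for illustration.
EXAMPLE_URL = "https://example.com/blog/entry"


def may_fetch(user_agent: str, url: str = EXAMPLE_URL) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ShinyNewBot" is a hypothetical renamed crawler that is not on the list.
    for agent in ("GPTBot", "CCBot", "Googlebot", "ShinyNewBot"):
        verdict = "allowed" if may_fetch(agent) else "blocked"
        print(f"{agent}: {verdict}")
```

A polite bot would run a check like may_fetch() before each request and back off when it returns False. A bot with a new name passes straight through because nothing in the file mentions it, and a rude one simply never runs the check.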

 

Why is it too late? Well, I asked ChatGPT this:

Please write an article in the style of Rick Murray from heyrick.eu

This was the reply from ChatGPT:

Heyrick.eu is known for its unique blend of technical insights and a touch of humor. Here's an article in the style of Rick Murray:

Title: The Curious Case of the Quantum Toaster: Is Breakfast in Another Dimension Just a Pop Away?

Greetings, intrepid readers of the digital realm! Today, we're embarking on a wild journey through the tantalizing world of quantum toasters. You heard me right, folks, we're about to take breakfast to a whole new level, and it's not just a pop-tart in another dimension!

So, you may be wondering, what in the multiverse is a quantum toaster, and why should you care? Well, my dear compatriots, strap in, because we're about to dive deep into the quantum realm, where toasters defy classical physics and toast with a flair that would make Schrödinger jealous.

The Toasting Tango of Quantum Mechanics

Picture this: You stumble into the kitchen, bleary-eyed and in desperate need of caffeine and carbohydrates. As you groggily reach for the toaster, you notice something odd. It's not just your regular toaster; it's a quantum toaster! What does that mean? Well, it means we're about to warp our minds like a bread slice in a wormhole.

In the quantum world, things don't play by the same rules. Instead of merely toasting your bread, a quantum toaster can simultaneously toast it and not toast it. Yes, you heard me right - Schrödinger's Toast! It's both crispy and not crispy until you open the toaster and observe your breakfast's fate.

Quantum Entanglement: When Bagels and Croissants Collide

Now, you might ask, "Rick, how can a toaster be in two states at once?" That's where quantum entanglement comes into play, my dear friends. In the quantum world, particles can become entangled, sharing properties regardless of distance. It's like your bagel and croissant discussing toasting preferences while sipping espresso in a distant café.

So, your quantum toaster's heating element becomes entangled with your bread, existing in multiple states simultaneously. Until you observe your breakfast, it remains in this superposition of toasted and untoasted, a cosmic breakfast paradox that'll leave your mind more scrambled than an egg.

The Quantum Breakfast Dilemma: When Toast and Toaster Are One

Now, here's the kicker, and I warn you, it's a mind-bender. When you finally pop that lever and take a peek, you collapse the quantum wavefunction, forcing your toaster and toast to decide their fate. It's as if the universe itself is asking, "Do you want your toast lightly browned or charred to oblivion?"

But wait, there's more! The quantum toaster can also exploit quantum tunneling, allowing it to toast your bread from one side without the need to flip it. You see, in the quantum realm, your toast can literally teleport to the desired crispiness, sparing you from the dreaded burnt toast catastrophe.

Quantum toaster: base image created by Dall-E2, further tweaked by me using my phone's photo editor.

Conclusion: Breakfast Beyond the Boundaries

In the end, my fellow breakfast enthusiasts, the quantum toaster challenges our very understanding of the universe, breakfast, and the mysterious realm where they collide. It's a toaster that defies convention, toasts without toasting, and exists in a quantum dance of entangled bagels and croissants.

So, as you sip your morning coffee and ponder the mysteries of the quantum toaster, remember that breakfast may never be the same again. Whether it's a scientific breakthrough or just a whimsical tale, one thing's for sure – your next slice of toast may be from a dimension where jam spreads itself!

Until next time, keep your toasters toasty and your minds quantumly perplexed. This is Rick Murray, signing off from the breakfast table of the multiverse.

 

The above text, from ChatGPT's opening line down to the sign-off, was AI generated using ChatGPT.

Clearly ChatGPT already knows about me and my blog and the sorts of things that crop up. It isn't perfect - it utterly missed my tea obsession, so I don't think I'd ever write "sipping espresso" unless it was dripping sarcasm, but...

I'm quite upset actually. That's better quality crap than a lot of the crap I write. It's taken me an hour and a half (and two cups of tea) to write this. Maybe I should just give up and get ChatGPT to do it from now on?

 

That's all for now. The grass isn't going to mow itself...

 

 

Your comments:


David Pilling, 2nd September 2023, 13:48
"If you steal from one author it's plagiarism; if you steal from many it's research." 
 
The ChatGPT stuff above is good, but not like your own words. AI stuff comes up with some ideas but is not the last word. 
 
I played with AI and have lost interest. For factual stuff it was not trustworthy - it invented 'facts' that were so plausible, hard to disprove but wrong. 
 
I would use it for brainstorming. 
 
There was a time when at school writing an essay consisted of marking bits of books in pencil and then copying them out. This was encouraged by teachers and it came as a big disappointment when the wind changed (probably at Uni) and this technique was frowned upon. 
 
The trouble is, if you read 20 books on ARM coding and then write your own book, that's OK, but most of your book is from what you've learnt from the others. 
 
Point being, acceptable human behaviour is often not much different to copying. All these 'words' you use, did you make them up yourself? 
 
As an academic I would scour the literature and reference any idea I used. The internet of people is not like that. 
 
TBH the academic world is not like that either, never unrestrained joy when you point out that something someone has worked on for two years was done previously by someone else. 
Anon, 2nd September 2023, 14:44
Rick - that AI article didn't read like anything you wrote. It actually read more like an un-filmed episode of Red Dwarf. 
 
I have just one more question... 
 
Would you like some toast?
Qanon, 2nd September 2023, 17:18
OTOH that stuff about quantum toast is amazing considering that Google can't find anything like it.
Zerosquare, 2nd September 2023, 21:24
That AI-generated article didn't even mention RISC OS, the Eurovision song contest, or British politics once. 
 
I think you're not likely to be replaced by ChatGPT anytime soon.
Rick, 3rd September 2023, 17:43
David: I can't believe that it was actually taught that writing an essay was copying stuff from books. 
 
In my day (late 80s), the important part of writing an essay was paraphrasing. Sure, read all the books necessary, but the result *must* be written in your own words, because only then can you demonstrate understanding. 
If you just copy stuff, doesn't mean you understood any of it. 
 
A lot of human endeavours are ultimately copying things. But it isn't a *direct* copy. Maybe when I use SimpleSeq to write something, it'll sound a bit like something somebody else has written. This isn't because I'm plagiarising, more likely because I sort of remember something like that which I enjoyed listening to, and thus wanted to make something like it. 
 
I'd say, unless one is involved in original research, pretty much everything we know has been learnt from others. But, again, there's the human element of taking all of these things we have assimilated in our lives and offering our own perspective. My style is quite chatty and prone to tangents, as I'm sure you've noticed. 
 
My SimpleSeq software, entirely original code written from the ground up. But the basic concept isn't mine. MIDI sequencers like that have existed as long as MIDI has, and indeed it's no stretch of the imagination to think to put notes into boxes arranged by time. Hell, if you switch the axis and put time down the screen you're halfway to a late 80s tracker editor (only without the really grody colours). 
David Pilling, 3rd September 2023, 20:21
The Civics beak only had one year left, lessons consisted of copying a couple of pages from a book, then when you finished you could muck about. 
In O level Religious Knowledge, it was all dictation, for hours the class would write whilst the RK beak read from a book. Problem for me since I was the slowest writer and set the pace for the rest of the class. 
Thus was education subverted. 
Perhaps being a scribe was still a thing. 
At any rate selecting which bits of which books you were going to copy was a step forward. 
Rick, 3rd September 2023, 20:42
Religious Knowledge... oxymoron.


© 2023 Rick Murray
This web page is licenced for your personal, private, non-commercial use only. No automated processing by advertising systems is permitted.
RIPA notice: No consent is given for interception of page transmission.

 
