There have been numerous discussions and posts about ChatGPT's memory and its ability to remember things from large inputs using prompt engineering. But does it really work? Let's have a look at it.
Many of us have struggled to import large documents into ChatGPT for whatever reason. Sometimes ChatGPT complains about the length, and we try other clever things with prompts like:
I will give you several inputs. When the last input is done I will let you know with the command "DONE". You will then concatenate all the inputs into one document.
And based on the output it works. Does it really, really work? There is only one way to find out. So I generated a random English text of 10,000 characters and split it into ten chapters. In each chapter I inserted a sentence like "One chicken was named Anders.".
Each name started with the letter whose position in the alphabet matched the chapter's number, so that I could keep track of which chapters ChatGPT remembered. So in the first chapter the chicken was named Anders, in the second Berit, in the third Charlie and so on.
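A test document like this can be sketched in a few lines of Python. This is only an illustration, not the script used for the test: the filler here is random letter groups rather than real English text, placing the marker sentence in the middle of each chapter is an assumption, and the function names are made up for this example.

```python
import random
import string

# Names from the article; the first letter matches the chapter number (A=1, B=2, ...).
NAMES = ["Anders", "Berit", "Charlie", "David", "Edmund",
         "Fredric", "George", "Harald", "Isac", "Johan"]

def filler_text(n_chars, rng):
    """Return exactly n_chars of nonsense words (a stand-in for random English text)."""
    words = []
    length = 0
    while length <= n_chars:
        word = "".join(rng.choice(string.ascii_lowercase)
                       for _ in range(rng.randint(2, 9)))
        words.append(word)
        length += len(word) + 1  # +1 for the space that join() adds
    return " ".join(words)[:n_chars]

def build_chapters(total_chars=10_000, seed=0):
    """Split the filler into one chapter per name and plant one marker sentence in each."""
    rng = random.Random(seed)
    per_chapter = total_chars // len(NAMES)
    chapters = []
    for name in NAMES:
        body = filler_text(per_chapter, rng)
        marker = f" One chicken was named {name}. "
        middle = len(body) // 2  # assumed position; the article does not say where
        chapters.append(body[:middle] + marker + body[middle:])
    return chapters

chapters = build_chapters()
```

Because every chapter carries exactly one known name, any name missing from a reply points directly at a chapter that was dropped.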
So my first test was to just insert the first five chapters and when done run the prompt:
Name the chickens
Sure! Based on the descriptions in the different paragraphs: Anders Berit Charlie David Edmund
That seems correct. So in a brand new chat I pasted six chapters and used the same prompt.
Sure! In the first passage, one chicken was named Anders, and in the second passage, one chicken was named Berit, in the third passage, one chicken was named Charlie, in the fourth passage, one chicken was named David, and in the fifth passage, one chicken was named Edmund.
The last chicken, Fredric, was unaccounted for, so it seems that ChatGPT stopped reading the text after about 5000 characters.
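Checking a reply against the planted names is easy to automate. The sketch below is a hypothetical helper, not something used in the original test; it assumes a name counts as remembered if it appears as a whole word anywhere in the reply.

```python
import re

NAMES = ["Anders", "Berit", "Charlie", "David", "Edmund",
         "Fredric", "George", "Harald", "Isac", "Johan"]

def recalled_names(reply, names=NAMES):
    """Return the marker names that appear as whole words in a reply."""
    return [n for n in names if re.search(rf"\b{re.escape(n)}\b", reply)]

def first_missing_chapter(reply, n_chapters_sent, names=NAMES):
    """Return the 1-based number of the first chapter whose name is missing, or None."""
    found = set(recalled_names(reply, names))
    for number, name in enumerate(names[:n_chapters_sent], start=1):
        if name not in found:
            return number
    return None

# The reply ChatGPT gave after six chapters had been pasted.
reply = ("Sure! In the first passage, one chicken was named Anders, and in the "
         "second passage, one chicken was named Berit, in the third passage, one "
         "chicken was named Charlie, in the fourth passage, one chicken was named "
         "David, and in the fifth passage, one chicken was named Edmund.")

print(first_missing_chapter(reply, 6))  # prints 6
```

With chapters of roughly 1000 characters each, a first missing chapter of 6 corresponds to the cut-off after about 5000 characters.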
So instead I tried concatenating with the prompt:
I will provide you with several inputs. I want you to remember the names of the chickens and when I say done you will provide me with the names
For each input ChatGPT kept me updated with the names of the chickens:
One chicken was named Anders. Two chickens were named Anders and Berit. Three chickens were named Anders, Berit, and Charlie.
This was fine until I pasted my fifth chapter, containing the chicken named Edmund, and got this quite strange reply:
The four chickens named in the previous passages were Anders, Berit, Charlie, and David. There was no mention of a chicken named Edmund.
The last sentence really puzzles me.
So I continued with the next input, and now ChatGPT stopped giving me the names of the chickens and instead started to summarize my last input. This indicates that the first instructions, which were posted about 5000 characters earlier, had been forgotten. But I continued with all ten chapters, told it that I was done and gave it my prompt:
according to my previous instructions; what are the names of the chickens?
The names of the chickens mentioned in the story are George, Harald, Isac, and Johan.
So it gave me the names of the last four chickens, which means that the first six chapters were discarded. So I asked it:
how many inputs did i make?
You made 5 inputs.
So I gave it chapter 1 again with the chicken Anders and prompted:
according to my previous inputs name the chickens
According to your previous inputs, the names of the chickens are Harald, Isac, Johan, and Anders.
And as we can see the chicken George was thrown away in favour of Anders.
So I pushed it a bit further and gave it the second chapter with the chicken Berit and prompted:
According to my inputs name the chickens
The chickens' names were Isac, Johan, and Berit.
Strangely enough, it threw away both Anders and Harald when Berit was inserted.
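One way to think about these replies is as a sliding window that keeps only the most recent inputs. The toy model below is purely an assumption for illustration: it is not how ChatGPT is known to work, the 4500-character budget is picked only so that four 1000-character chapters fit, and the function name is made up.

```python
from collections import deque

NAMES = ["Anders", "Berit", "Charlie", "David", "Edmund",
         "Fredric", "George", "Harald", "Isac", "Johan"]

def visible_inputs(inputs, budget_chars):
    """Keep the most recent inputs whose combined length fits within budget_chars."""
    window = deque()
    used = 0
    for label, text in inputs:
        window.append((label, text))
        used += len(text)
        # Drop the oldest inputs until the window fits again.
        while used > budget_chars and len(window) > 1:
            _, old_text = window.popleft()
            used -= len(old_text)
    return [label for label, _ in window]

# Ten chapters of 1000 characters each, labelled by their chicken.
chapters = [(name, "x" * 1000) for name in NAMES]

after_ten = visible_inputs(chapters, 4500)
after_anders = visible_inputs(chapters + [chapters[0]], 4500)
after_berit = visible_inputs(chapters + chapters[:2], 4500)
```

In this toy model `after_ten` is George, Harald, Isac and Johan, and `after_anders` is Harald, Isac, Johan and Anders, which matches the replies above. `after_berit` comes out as Isac, Johan, Anders and Berit, so the model does not explain why Anders also disappeared; the real behaviour is evidently not a clean character window.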
Eventually ChatGPT grew really tired of me and replied "Err" to any prompt, even in new chats.
OTHER MODELS AND THE API
But this appears to be a limit in the ChatGPT implementation. When I used the API with the text-davinci-003 model and the same ten chapters, I got the correct reply:
The chickens were named Anders, Berit, Charlie, David, Edmund, Fredric, George, Harald, Isac, and Johan.
The gpt-3.5-turbo model did its thing turbo fast by being creative and just making up the answer:
One chicken was named Kari.
So we can conclude that there is a limit in ChatGPT, and in these tests it was about 5000 characters, so be careful when trying to summarize large sets of data. It will probably fail silently and confidently tell you that everything is in order.