MCPDM Week 9 – Revision

So we’ve come to Week 9 of MCPDM and classes will soon be over!

In the afternoon Yunhyong asked us what we thought we had learnt about digital preservation over the last few months. I initially could not think of a thing, but after 20 minutes of “automatic writing” on post-it notes it turned out that we actually knew quite a bit. Not only that, after comparing our post-its to the OAIS model it seems like we are subconscious experts on that as well!!

The only, and quite major, area my group missed out on was the actual end user. As someone mentioned at the start of this week – what is the point of preservation if you do not have access? We swore to better our ways… which does not stop us from enjoying our amazing post-it charts all the same! They are the colours of the rainbow:


Automatic Writing


Automatic Writing put into OAIS model

Metadata – it’s the very berries!

18th February catch up – In the lab today we had a look at metadata in context. The exercise involved checking what kind of metadata you could find in a range of file formats using right-click Properties and metadata extraction tools. There were a few glitches to start with but once we got exploring it became very interesting. My particular favourite was the Eiffel Tower picture. We opened this first in extractmetadata, then in exifdata, and were astonished to see the GPS co-ordinates showing up on Google Maps as slap bang in the middle of Washington, USA. This is very intriguing. My thoughts are that something has messed up the metadata, or someone has taken a picture of a picture of the Eiffel Tower while standing in Washington. Any other ideas? Since doing this exercise I have had lots of fun at home checking out many of my own photographs. Metadata is wonderful. The more I know about it the more I understand why it is so valuable, but ‘how much’ does need to be weighed against the time and money involved in generating it in an archive. The Dublin Core/PREMIS elements which relate to ‘rights’ particularly interest me as I wrestle with the complexities of Copyright Law, Data Protection and FOI.

Washington

Lightning response

Response to Disordered Beings’ Blog saving thoughts….that’ll be the 28th January! Yeh,  who’s disordered now?

Hmmm – Having read the disorder of being’s gallusly imaginative approach to our task (what did they have for lunch?) my response would be that by picking out little bits to save you reduce the value. None of this material is unique, but the entirety with its interwoven strands and offshoots does make it interesting.  Surely we don’t just preserve for scholarly reasons? Remember that eclectic collection of stuff we brought in to ‘show and tell’ in records and evidence…….everybody say ahhhh!

Selection and Appraisal –

Within the traditional criteria for appraisal we decided that this site has no administrative, legal, or fiscal value. There is some evidential, informational, and/or research value, insofar as the site could be used to show literary trends in a particular genre and readership group. It could also be used as an example of how books are promoted online at this time. Researchers into social history or marketing could get data from this site. The site has very good, up-to-date links to social media, Amazon (for purchasing featured books) and a blog to promote engagement and interaction. The intrinsic value of the site is that it is a good example of communication within a special interest group. However, it is hard to decide how COMMERCIAL this site is. Money is being made somewhere!

Using the Data Preservation Alliance for the Social Sciences (Data-PASS) guidelines for social science data we reached similar conclusions. The cost of long-term maintenance of the data is something we considered with regard to keeping the look and function of the site. We concluded that the entirety should be saved as it would be detrimental to the user experience to lose any of it. The whole site doesn’t appear very big but it has a lot of images so this might have an impact on storage costs, and maintenance of the links could be problematic.

The only issue which is not addressed by either set of criteria is that of AFFECTION. It should be considered as part of the appraisal process as the community involved may love it and want to re-live their memories. 😉 ❤

Checksums Checksums!

In week 2 of MCPDM we were discussing Authenticity, Integrity and Reliability with regard to digital records. One of the key things which I took away from it is how information content becomes more important than form in the digital environment, due to the difficulty in maintaining access to old file formats. These were questions which we were then practically faced with in the afternoon lab. Using a Checksum calculator we compared the content of a number of different files. The results showed (as seen below) that the information which the files contained was the same for “1.a” and “1.b”, and for “2.a” and “2.b”, despite different formats and titles. This is shown by each pair sharing the same Checksum.

1.a: MCPDM 4    1.b: MCPDM 3

2.a: MCPDM 2    2.b: MCPDM 1

So what are the practical implications for information professionals? On the one hand Checksums seem a practical (and time saving) tool for a Record Manager in scoping out edited copies and degraded files. On a more fundamental level it does seem to be a question of integrity. Checksums can signal that a file has not been edited, since even small changes, such as upper case characters to lower case characters, completely change the sum. However, as discussed in the morning class, ‘Authenticity’ and ‘Reliability’ do not follow automatically. We struggled to confirm Authenticity (that the file is what it purports to be) since we were unsure of its purpose. The lack of information meant that we could not confidently say that the files were reliable, despite confirming some level of integrity. This shows that the wider context is essential for being able to confirm any of these essential qualities.
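To make the point about integrity concrete, here is a minimal sketch of the comparison. It is not the calculator from the lab: MD5 is an assumption about the algorithm, and the helper name `md5_hex` and the sample text are made up for illustration.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 checksum of the given bytes as upper-case hex."""
    return hashlib.md5(data).hexdigest().upper()


# Hypothetical file contents, standing in for the lab files.
original = b"Minutes of the records meeting"
renamed_copy = bytes(original)               # same bytes, as if saved under a new title
edited = b"minutes of the records meeting"   # one capital letter changed to lower case

same_content = md5_hex(original) == md5_hex(renamed_copy)
same_after_edit = md5_hex(original) == md5_hex(edited)

print(same_content)     # True: identical bytes give an identical checksum
print(same_after_edit)  # False: a one-letter edit changes the checksum
```

The checksum is calculated from the bytes alone, so a new title makes no difference to it, while any change to the content does. It says nothing about who made the file or why, which is where Authenticity and Reliability still need context.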

After seeing this difficulty, I look forward to learning more about how an appropriate and supportive context can be created for these types of records in the following weeks.

Also, Kate is really bossy and would not allow me to post this week’s blog unless I sent her the Checksum for the Word document containing these images.  She’s gone power mad! So here goes: CEA235CAF10FBBFBE5965DAD975C282F

Oh one

01101101 01100001 01100100 01100101 00100000 01111001 01101111 01110101 00100000 01101100 01101111 01101111 01101011 00100001
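Spoiler warning for anyone who would rather decode this by hand. A small sketch, assuming each group of eight digits is one 8-bit ASCII character code (the variable names are just for illustration):

```python
# The binary string from the post, one 8-bit group per character.
bits = (
    "01101101 01100001 01100100 01100101 00100000 "
    "01111001 01101111 01110101 00100000 "
    "01101100 01101111 01101111 01101011 00100001"
)

# Read each group as a base-2 number, then look up the matching character.
message = "".join(chr(int(group, 2)) for group in bits.split())
print(message)
```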

Very Best Wishes for your health and wealth in 3737, or 11BB. Name the animal who’d like 3737, and guess which politician might feel comfortable with the jolly British 11BB.
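For anyone who wants to check their answer, here is a hedged sketch. It assumes the two figures are the same number written in base 8 and in base 12 (with B standing for eleven); that reading is an assumption, not something stated above.

```python
# int(text, base) reads a string of digits in the given base.
as_base_8 = int("3737", 8)
as_base_12 = int("11BB", 12)   # B is the digit for eleven in base 12

print(as_base_8, as_base_12)
print(as_base_8 == as_base_12)
```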

Well done to our Impress team member for sorting the whole jingbang in record time! I got off to a slow start being unsure about shoot and corrupt as a game. As is so often the case, a little quiet contemplation and discussion with (interrogation of) my fellow IMPs and ‘I get it’. The Sherlock Puzzle was good.

Danny Hillis’ explanation of where we are and where we are headed with information was thought provoking. My first thought is because we can store it, should we? It’s hard to make decisions about future usefulness.

Is data the new oil? When I first heard this phrase I naively thought of oil/data as a lubricant which allows things to operate smoothly. A good example is data/information in health and welfare: if it were engineered to be shared across organisations, from the cradle to the grave, it would be hugely advantageous. I’m sure most of us have experienced frustrated professionals unable to take action just because the necessary information cannot be accessed. Information is the oil which allows action.

Big data = new oil = big bucks. Is it time to redefine anonymous?

Damaged files and Questions

Damaged Files

Unfortunately some of the clean files wouldn’t open on the computer I was using, so I couldn’t really compare all of the undamaged files with those I damaged, but I was able to see that the Word file lost all of the diagrams when it was damaged.

The damage to the PDF file resembled the image damage more, with blocks of black and grey appearing over some of the diagrams.  The computer recognised there was damage to the file and warned me of this before I opened it.

Of the image files, the TIFF file seemed to suffer little damage when I ‘shot’ it, but when I corrupted it the damage to the image really showed. The TIFF file and the JPEG file, though very obviously damaged, would still have been readable; however, the PNG and GIF files were completely undecipherable.
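The ‘shoot’ and ‘corrupt’ tools we used are not described here, so the following is only a rough sketch of the general idea, assuming that damage means overwriting a few randomly chosen bytes in a copy of the file. The function name `damage`, the number of hits and the sample data are all made up for illustration.

```python
import random


def damage(data: bytes, hits: int, seed: int = 0) -> bytes:
    """Return a copy of data with `hits` randomly chosen bytes overwritten."""
    rng = random.Random(seed)          # fixed seed so the damage is repeatable
    damaged = bytearray(data)          # work on a copy, never the original
    for _ in range(hits):
        position = rng.randrange(len(damaged))
        damaged[position] = rng.randrange(256)
    return bytes(damaged)


# Stand-in for the contents of a clean file.
clean = bytes(range(256)) * 4

shot = damage(clean, hits=1)         # a single hit
corrupted = damage(clean, hits=50)   # heavier damage

# Count how many byte positions differ from the clean copy.
shot_changes = sum(a != b for a, b in zip(clean, shot))
corrupted_changes = sum(a != b for a, b in zip(clean, corrupted))
print(len(corrupted) == len(clean), shot_changes, corrupted_changes)
```

The damaged copy stays the same length as the original, and only a small share of its bytes change. How badly that shows when the file is opened depends on the format, as the comparison above suggests.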


3 Questions

  1. On March 11, 1968, President Lyndon B. Johnson mandated that all computers purchased by the United States federal government do something. What was it?

A. March 11 – U.S. President Lyndon B. Johnson mandates that all computers purchased by the federal government support the ASCII character encoding.


Google search “march 11, 1968, president lyndon b. Johnson” – answer via  “1968 in the United States”


  2. The vigesimal system was memorably employed by a U.S. president when dedicating a cemetery. Which president, and what decimal value did he express in vigesimal notation?  Hint: in Old Norse, a notch on a stick used to tally values in vigesimal notation was called a “skor.”

A. Four score and seven years – Gettysburg Address


Google search: vigesimal – and found that it meant base 20 numeral system so knew it was the  Gettysburg Address – but hadn’t known it was a dedication ceremony for a cemetery!  – Looked Gettysburg Address up on Wikipedia to check the exact quote!
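The decimal value asked for follows from counting in twenties: four score and seven is 4 × 20 + 7 = 87. A tiny sketch of the conversion both ways (the function names are just for illustration):

```python
def from_scores(scores: int, extra: int) -> int:
    """Convert 'scores and extra' (base 20) to a decimal number."""
    return scores * 20 + extra


def to_scores(number: int) -> tuple[int, int]:
    """Split a decimal number into whole scores (twenties) and the remainder."""
    return divmod(number, 20)


years = from_scores(4, 7)
print(years)             # 87
print(to_scores(years))  # (4, 7)
```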


3.  freeformatter-output

A.  ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/


Wikipedia search on “base 64” which gave the answer.
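Another way to check the answer without relying on a website is to ask Python’s standard library. This is a sketch under my own assumptions about how to test it: it packs the 64 six-bit values 0 to 63 into bytes, in order, and Base64-encodes them, which should spell out the alphabet.

```python
import base64
import string

# The alphabet as given in the answer: A-Z, a-z, 0-9, then + and /.
expected = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

# Write the values 0..63 as 6 bits each: 64 * 6 = 384 bits = 48 bytes.
bit_string = "".join(f"{value:06b}" for value in range(64))
packed = int(bit_string, 2).to_bytes(48, "big")

# Base64 turns every 6 bits back into one character of its alphabet.
encoded = base64.b64encode(packed).decode("ascii")
print(encoded == expected)
```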


I know I seem to have put my faith totally in Wikipedia, but I do really like Wikipedia!  I know that it can be unreliable but if I wasn’t sure I would check the facts via other sites.  For example once I knew the answer to the base 64 question via Wikipedia, I checked again via a few other sites and got the same result!