The Effectiveness and Durability of Digital Preservation and Curation Systems

As a variety of digital preservation systems have emerged to address the changing needs of cultural heritage organizations, Ithaka S+R has undertaken an IMLS-funded research project to assess the effectiveness and durability of different approaches to cloud and local preservation. We compare community-based systems and commercial systems, interviewing representatives of system providers as well as their users.

Track: Business & Sustainability & Capacity Building


Hello everyone, and thanks so much for joining this conversation. I'm excited to present some of the findings from some preliminary research that we are conducting at Ithaka S+R, and then facilitate a discussion, so I will try to move through my slides quickly to give us a lot of time to talk afterwards. This is an 18-month research project funded by IMLS. We're calling it The Effectiveness and Durability of Digital Preservation and Curation Systems, which we've shortened to DPCS for brevity. We're just about wrapping up writing the report at the moment, so we should have published research by the end of the year. So here I'm going to introduce the project, talk a little bit about the methodology, share a few preliminary findings, and then open it up for discussion.

First of all, I want to thank my colleagues Oya Rieger and Roger Schonfeld, who have been really instrumental in bringing this research together. The three of us are the project team for this work. I also want to thank our advisors. We've had some great advisors on this project: Mike Furlough, Carol Mandel, Robert Miller, Verónica Reyes-Escudero, Katherine Skinner, and Don Waters. They've been really helpful and brought a lot of expertise to the research.

So I'm going to talk a little bit about how we've structured this research. We wanted to do a study analyzing the landscape of preservation systems from the perspective of business strategy, to understand the degree to which these systems, as they are currently formed, are sturdy and sustainable for the clients who are investing their content in them. This came out of concerns around DPN and some of the sustainability concerns that have come from recognizing that not all systems are going to be here forever. In order to do this, we developed a taxonomy of the current landscape of digital preservation. This involved a lot of desk research and resulted in a lot of metrics and categories around different types of preservation systems and what they do.

One of the primary research goals here is to study the differences between community-based preservation systems and commercial preservation systems. And so we ended up selecting eight systems: the community-based four, Samvera, Islandora, APTrust, and MetaArchive, and the commercial four, LIBNOVA, Preservica, Archivematica, and Arkivum. Once we selected those eight systems, we conducted interviews with leadership from those systems, as well as staff relevant to the questions we were bringing to the organization. We had two sets of interviews with them, one focused on business strategy and one focused on user and client needs. And then we identified users of the systems and interviewed them to get their feedback about how their experience has gone working with these systems.

So we gathered evidence through interviews and through desk research, and then developed system profiles, which will be published and made publicly available for each of these eight systems. These are really like a factual outline or sketch of what these systems do and what their place is in the marketplace. We also wrote up case studies that were more analytical. Those won't be published because they name each specific system, but they were shared with the system providers and with our advisory committee, and used to inform the writing of a narrative report, which anonymizes specific systems. We don't want to be calling anyone out, but it does share high-level findings about the landscape and a lot of key takeaways from the research, so that'll be published towards the end of the year. Now, let me just minimize something here. So our goals for this research included answering these questions: What business approaches are used to plan and implement digital preservation and curation systems, and how are they different? How are the different requirements and resources of cultural heritage institutions factored into the system development process? How do initiatives develop sufficient capital and the ability to navigate the landscape to maintain sustainability? And how could grant funding guidelines or investment strategies improve outcomes?

And this is a useful graphic that we have used, especially for folks who are kind of new to this arena, to explain the complexities of the digital preservation lifecycle. It really involves multiple steps, and often involves multiple systems. We can return to this in the discussion if it's helpful. I don't want to spend too much time on it, but just to point out that we're considering the full digital preservation lifecycle as we conduct this work. In terms of our findings, we found that there are some really interesting differences between community-based systems and commercial systems. One of them is the difference in how they scope out their clients. Community-based systems, think APTrust or MetaArchive, have really specific communities that they serve, higher education communities mostly, and they are very in touch with those communities and attuned to the community's needs. Alternatively, you have commercial preservation systems that are really looking to diversify their client base. They may be working with cultural heritage institutions, but they also may be working with banks or tech companies or pharmaceutical companies in order to increase revenue. And this is kind of an interesting issue, because in some ways the diversification of the client base makes the system more resilient in a market economy. On the other hand, it makes it perhaps more risky for cultural heritage institutions to invest in, because in the event that only 1% or less than 1% of a system's revenue is being generated by cultural heritage clients, could it wind down that line of work because it's not seen as viable? So we found some interesting takeaways there.
Also, these organizations are led in very different ways. A lot of community-based systems have fairly involved governance models that can result in a lot of committee-based decision-making processes. A lot of commercial systems have more traditional corporate leadership models that allow for swift executive decision-making. They've built in feedback mechanisms to learn from their communities, from their clients, but they centralize decision-making. And so this could be a liability for community-based systems that have to move swiftly in evolving technology landscapes when they don't really have a lot of centralized decision-making power. We found that, for clients who are using these systems, it can be very complex to build an internal consensus around preservation strategy. In a lot of cases, in museums but also in higher education libraries, there is a need to coordinate across a lot of different departments, and multiple departments may be plugging into different points of the preservation cycle with different tools, in ways that aren't coherent across the institution. So that can be a real challenge for the system providers, and of course it ends up being a challenge for the clients as well. And so we're finding, and this may be obvious to those who have done this type of work, that having buy-in at senior levels of your organization around a preservation strategy is really crucial to creating those workflows and lines of communication that will result in a coherent digitization and preservation strategy.

Commercial systems have come to learn the value of offering exit strategies to their clients. This seems to have resulted from anxiety on the part of the clients around how they could get their content out of the system in the event of a preservation system going under. A big reason for this anxiety is that commercial systems don't make their code bases open source the way many community-based systems do. And so there's a real opacity and a lack of transparency around how content is managed once it's given over to the systems. So commercial systems have developed these exit strategies where they basically put their code and the client's content in an escrow account that will be automatically provided back to the client in the event of the system breaking down. And finally, we found that a lot of commercial systems claim to be a turnkey, all-in-one system, and that you can basically outsource all of your preservation needs to that system. Clients reported that wasn't the case for them, that there isn't really such a thing as a turnkey solution. In fact, even with the most robust commercial preservation systems, you still had to upskill your staff, you still had to make investments in teaching your staff how to conduct preservation workflows, and you needed to invest in other systems as well. Basically, there are hidden costs that are unavoidable, and when a leader of an institution wants to just get rid of the preservation problem and pick something off the shelf, that tends to not be the best strategy. So, for our discussion: I know I went through that kind of quickly, but I want to make sure to give us lots of time to talk.
These are just a few prompts that I have been thinking about and am curious to ask the brain trust here. One is best practices: what are the best practices for building consensus internally and designing workflows across departments in order to effectively develop a preservation strategy? Another would be, what's your preference between investing in staff skills versus outsourcing? I want to test that hypothesis we were developing, that it's kind of unavoidable to make investments in staff preservation skills. Also, what content types are cultural institutions focused on preserving at the moment? I'm particularly curious about born-digital content, social media content, emails, and content that seems like it could scale up quite rapidly. And to what degree do access and discovery inform preservation strategy? Of course we have preservation systems that offer dark archives and that aren't intended for access, but there's another argument that we encounter here, that access, discovery, and relevance really protect a collection in a lot of ways. So I'm curious to hear your thoughts on that. Of course, these are just some prompts I thought of, and I'd love to go in other directions if there are other points of interest. So I'll just pause here and open it up to the group to have a discussion.

Okay, I can start with kind of a question-statement. I think one of the difficulties, not specifically with preservation systems but with technological systems for collections in general, is this. I think that there's a great understanding among staff in the field that staff skills are required. But when digital preservation is a new activity, even though preservation is preservation, whatever the format of the things that really matter for the preservation purpose, it's all preservation, there's difficulty from a funding perspective. It's sometimes easier in a budget to say, this is the cost of the thing, because we are paying Company Y for the thing, and there's a contract, we are bound to pay Company Y for the thing, versus, here is a staff person who represents a new position, and therefore an additional operating cost, and that staff person then has to be taught. So one of the difficulties that a project I'm working on is coming up against is, we need a new system. It's possible to have the new system be funded by grants. It's not possible to have the ongoing operation of the new system funded by grants, because that would be an operating budget expense, which most funders are unwilling to participate in. So I guess that's my question-statement: staff skills versus outsourcing is not really a matter of what is practical, it's a matter of what is feasible from a budget perspective, in the eyes of the people who manage the money and who sign off on that operating budget.

Right, yeah, it's really interesting to find that this falls into that category of operational expenses often not being covered by funders. Can I ask you to share a little about what organization this is, or what project this is?

Yeah. So this is for the Five Colleges and Historic Deerfield collections management database, so it's not a preservation system, but there are other parallel projects that are working on preservation systems. We're using Mimsy XG. It's a very complicated, bespoke, partitioned system, and we need to move to something else. I'm pretty confident that we can acquire grant funding to supply the system. What we cannot acquire grant funding for is to train everybody, to make sure that they stay trained, to hire new or train existing database administrators, to pay for the ongoing costs. If it's a cloud-based system, that's an operating expense, not a capital expense. So there are just many budgetary concerns about how to proceed in a way that is sustainable from a budget perspective. We're replacing a system that we bought in 1995.

Right. And then you have further complexities where you have a relationship with, like, a central IT at the consortium. Is that a concern there?

There are six IT departments. It's incredibly complex. It's not even worth taking time to explain right now, but we have an entire session on that on November 9th, so please come to it if you're really interested in the nitty-gritty. But I guess I'm interested in how other folks in the session have advocated for digital preservation, because I think for a lot of lay people it's just scanning it and saving it forever: what do you mean, costs and things? So, how to advocate for that as costing long-term money.

Does anyone else have a similar situation on their hands, or has anyone gone through something like this?

Hi, I'm Jeanie Choi from the Metropolitan Museum. We do have a digital preservation system, which we use for our accessioned time-based media artworks. Our challenge is this: we are fortunate enough to have a time-based media conservator on staff. Unfortunately, that position is donor funded, so unless we keep finding money to support it, that position is at risk of going away. But we were able to use another set of donor funds to acquire a system, implement it, and start using it. The risk of losing this artwork was so high that we really just needed to get started right away. And we treat the digital preservation system as if it's a storeroom. We don't plan on accessing files from it; we keep access copies for things that are born digital. So it's really a storeroom, because it wasn't appropriate to keep our digital works on a network share, which was happening. That's why we decided to use this system, but it is difficult, because funding is a challenge for us, especially around conservation of this material.

Could you speak at all to the org structure path that was needed to get buy-in for that decision?

Um, we didn't have to go to the director level. We had support, and we worked very closely with our conservation department. In this case it was our photo conservation department, which actually handles all our time-based media, so we worked with them to get it approved and through. We did get the department head's approval, but we didn't have to go up to the director or deputy director level. But hopefully the conservator position becomes full time, and that is being worked on at the higher level.

Does anyone else have any thoughts or stories to share on how to get systems and staff funded for preservation work?

Well, I'll just share that, in our conversations with system clients, I was surprised to see sometimes that even wealthy museums would seem to not have a really well-developed preservation strategy. They might even hire a contractor, set up an agreement with a system, and really have that be the solution. And it seems to me that one of the benefits of thinking strategically about how your organization is going to preserve materials is that you start to develop in-house expertise, and that can prove very valuable when one person leaves, or there's turnover, or you have a contractor. I feel like the technology in this space is changing so rapidly that it can be very hard to produce good results on an ad hoc basis that way. So I'm curious if anyone has had any reflections on that or experiences related to that.

So if not, maybe I'll just open it up to see if there are any other topics folks would like to discuss during this time. I think we have just a few more minutes left here.

Happy to jump in. In regards to the question that Jeanie was talking about, in terms of funding and presenting the case for these new systems especially: I jumped in and played a very small support role with Jeanie's team and the other teams that worked on this. And what I found really compelling was that the moment I got into that space, they already had the case for why this was needed. They had a detailed version of it, and they had the one-page version of it. They had a plan of how much it would cost to implement and how much it would cost to continue operations. We were very lucky to have funding available to just get this kick-started, but they were also able to identify how these costs would be effective, and how we could reduce costs elsewhere to ensure that this worked. And I think there was a compelling case just due to the fact that these systems are not a peripheral need. They're an absolute necessity if you want to maintain your data, especially your artwork. Sometimes you have to assume that the bandwidth of the people you're going to be trying to convince is really, really short. So the quicker you can make the case about the critical nature of this kind of work, what the impact of it is, and what the impact of not having it would be, I think the easier it becomes to show that this is absolutely something that we need to be moving towards, not just as a side project that one piece of the organization is doing. That really is the operations of the future.

And I wonder if you could speak a little bit more about this: have there been ways that the organization uses the data that is collected in the process of preservation for analytic purposes or anything like that?

Well, Jeanie, you might be able to speak better than I can to this. I know that we've been able to pull information out of our archives and present it as part of our 150th anniversary. I don't think that has anything to do with data analysis, but yeah, Jeanie, do you have any thoughts?

I don't think we've been asked to provide analytics. We report on how many time-based media works we have and how much storage space they take up. Sometimes these media have very complex installation and supplemental material, let's say, so we need to track that. But we haven't had to report on analytics in terms of what we're storing in our digital preservation system.

And when you think about it, it is interesting to consider, right? If we're storing artworks in these preservation systems, essentially we're storing valuable information, something that has a value that can be reused in the future, either as content on the website, or content for future exhibitions, or frankly content for historical analysis. So considering what that value is and what that value can be is kind of an exciting and daunting idea.

I mean, it's also a lot of educating your colleagues about why preservation is so important, especially for born-digital artworks. People don't realize that files can get corrupted and can no longer be accessed, or file formats can become obsolete, or that you need geo-located storage options. So a lot of it was us being educated and educating other colleagues who needed to do approvals. But again, for us, it's the artwork. As one of our colleagues said, you wouldn't store a painting in a server room. So we needed to treat all our artworks with the highest level of care. I think that really helped our case.

And can I ask what system it was that you landed on?

Laura, are we allowed to say that? It was a system that you did not mention in your intro, I think.

We did do an RFP. We went through an entire process of comparison for it. I think there were some systems that were mentioned there that I missed.

In some of our interviews with the staff at system providers, they bemoaned the RFP process in cultural heritage institutions, which I think is, you know, that's the territory, that's a fair and good way to conduct work. But of course they work in other sectors that are able to just make a decision very quickly on some of these matters. And that was something that gave me some concern about their ability to effectively serve cultural heritage institutions. They're looking at this from a very different perspective than some of the community-based systems, where they're saying, what's the cost of a sales cycle? Islandora is not really asking that question in their product development. So I thought that was just an interesting cultural difference between the commercial and community-based systems that could result in some product outcomes that are not currently anticipated.

I mean, our procurement department requires us to go through this process, and I think a lot of nonprofits do the same, probably because money is always a challenge and funding is very difficult. In many cases we have to go through this to show that we did our due diligence and are getting the most appropriate system at the most appropriate cost. So there was no other option, really.

Yeah, I think that's fairly normal. Five Colleges, you have probably had the same process, right? And yeah, it doesn't seem like there's a great way around it. It just seems like something that could end up making relationships between the community-based systems and cultural heritage organizations more robust, even if it just adds time to it. I wonder... oh, go ahead.

I was also really interested in your first bullet point on the discussion slide. I wonder if anyone has examples or stories, or harrowing tales, from workflows across departments. I'm more interested in workflows across functional units, as opposed to, you know, European art versus African art. Do any of you have workflows that relate to the IT staff, or libraries and art departments? Because there are, I think, many ways in which, especially in university systems, skills exist in other departments that don't exist in the museum, and sometimes we rely on those other departments. I'm wondering if there are examples or stories of enabling preservation by tapping into people and resources in departments outside the museum.

I guess maybe not a lot of others have experience with that. But yeah, I think that it's especially difficult, from my understanding, in academic museum spaces, where the university administration centralizes IT functions and makes decisions based on that administration's goals, and the museum is sometimes not on the top priority list for a university administration's goals. I also did some research a few years ago on academic library and museum collaboration, and in that research we found that libraries often have more access to senior administration in the university than museums do. The outcome of that research was really coming to understand the importance of building relationships on campus. Typically the library director and the museum director would have great respect for one another's work, but sometimes it would be the case that the library director had to be the one advocating for the museum's agenda in meetings with provosts and so on. And so I think that the museum can be at a disadvantage on campus. But there are ways of building soft power and building these relationships up that can help to get the museum's needs on the table for the discussions around what systems we want to invest in and so on.

So that's one perspective from some work I did a little while ago.

I read that report, it was great.

Thanks. Glad to hear that. I realize we're approaching three, so I just want to pause and give another moment in case anyone has another topic of interest that they want to bring up.

Okay, great. Well, it's been really a pleasure to present to you all, and this was a really interesting conversation. I appreciate hearing some of these perspectives from Five Colleges and from the Met. It's really interesting to hear about your work and how you're navigating these questions. I'm very grateful for your time, and looking forward to connecting down the road. Excellent. Take care.