
OR2018 Recap

62 days ago

We got this

NOTE: THIS WAS INITIALLY POSTED AS A DRAFT, it has been updated twice (see below). I reserve the right to add links to things that need them, as the idea occurs to me… but it’s mostly done now. —HJP 6/20/2018 10:41am CDT

Before OR2018, I went on vacation with my wife to Santa Fe, New Mexico. We drove from Missouri, so we were in the car for a while, and we checked out a book on CD from our library. The book we got was by Kelly McGonigal, who had recently spoken at a work retreat my wife attended… to give you a feel for where Kelly is coming from, here’s a TED talk of hers.

The book/CD set we checked out is called, The Neuroscience of Change: a compassion-based program for personal transformation.

Listening to this CD on a road trip was very relaxing… I told my wife I felt like I’d been on an all-day mindfulness retreat when we stepped out of the car.

Why am I bringing all this up? Well, while listening to this CD, I came to the realization that I have been resisting some change my career has been going through, and I also got in touch with a capacity I didn’t realize I had: a feeling, that, “I’ve got this.” I vowed to myself to take this confidence into OR2018. And I was startled to find that same confidence reflected back to me by everyone at Open Repositories, from the speakers to all of my colleagues. I can’t say whether I interpreted this “vibe” based on my own intention prior to the conference, or whether it was something other people could observe. I will say that the statement “Open Access has arrived” bounced around a bit, from speaker to speaker, and you could say that “we’ve got this” is a variation of that.

Enough preamble, on to the conference!

Informal Meetups day (Sunday, 6/3/2018)

I ran into my friend Dermot Frost in the airport in Denver, as well as Carolyn Cole from Penn State. We ended up hanging out after we landed in Bozeman. Carolyn led the Valkyrie Code Read workshop, which was one of the things I was most looking forward to at this conference, and I definitely wanted to find out more about how they are using Valkyrie at Penn State. So, we wandered Bozeman for an afternoon. Carolyn proposed walking until we could get a clearer view of the mountains, and Dermot and I agreed. We ended up walking to the very edge of Bozeman.


We got to the MSU Library in time for the “Informal Meetups” and I had a nice chat with a few of my DSpace friends. I broke the news that I’m going to be working more with Samvera and less on DSpace (this is the professional change I mentioned wrestling with above). I didn’t think this was news, since I thought I’d told people, but it did seem to cause a few pouts. I’ll stick with the DSpace community and pitch in when I can, but… my focus will be Samvera from here on out. It’s just the way it is: my employer wishes it, I will make it so.

Workshops day (Monday, 6/4/2018)

Workshop: DSpace REST-API

If you’d like to follow the workshop, you can do it all on your own, self-guided.

One tool mentioned during this workshop, Postman, looks very helpful. I installed it and have played around a bit. It’s a nice suite of tools for working with a REST-API, the kind of thing that remembers all the various details for you, so you don’t have to jump between browser sessions, read docs, and copy/paste between windows… Postman keeps track of the complicated things, so you can focus on using the API. That includes (and especially) logging in and maintaining a session for an authenticated API.

Workshop: Valkyrie Code Read


Valkyrie on GitHub

My random notes: I realized just how much of a newbie I am to Ruby. As Carolyn read through the code, I found myself googling about inheritance in Ruby. I found this page in this tutorial (tutorial start), and I would like to follow that tutorial later.

And this one line in Valkyrie made me just marvel: I think that’s three maps deep?

I then started exploring the resources I found on Ruby, and discovered the Open Book Shelf. I’ll have to return to that later.

More notes from my notebook: change sets are a key part of how Valkyrie works, and the code around them is pretty clear. This is where we dove in during the code read, so if you want to duplicate the experience, or otherwise deepen your understanding of Valkyrie, start with change sets.

Also, a thing I have learned in the past, but was nice to see in practice during the code read: specs/tests are great docs on how things are supposed to work, so, if you’re lost, start with the tests.

I should note that, while Valkyrie is not on the draft Hyrax road map, it’s clearly on the “road map.” (Hat tip to Tom Johnson for that turn of phrase and road map link.) As I spoke with other Samvera community members, and listened to them speak during sessions, it became very clear that the entire community has accepted the inevitability of Valkyrie becoming part of the stack we will all use. It’s a tool we all anticipate having in our toolbox, sooner rather than later (see below for more on this theme). I’ve been here before; it’s a touchy subject, but ask any DSpace community member about “DSpace 2.0.” :-) However, all joking aside, most of the ideas that were floated as part of DSpace 2.0 did eventually make it into the core DSpace; it just didn’t happen all at once. I do believe that Valkyrie is on its way into the Samvera code base.

After the code read was done, there were other workshops starting up, so I wandered in to the RedBox workshop…

RedBox workshop

…Where I found out about this handy tool: Data Curator, but it doesn’t run on Linux :-( (UPDATE: I’m wrong, it builds just fine on Linux. You need Yarn installed, and it builds an AppImage version, which is easy to install. Yay! A new toy!)

However, if you need a quick and dirty CSV tool (which is not Excel), and you have Atom already installed, Tablr works well. Though it’s a tad unstable until you patch a bug (patches and workarounds are posted on that issue I just linked).

RedBox is neat, one can learn a lot from what they’re doing and how they’re doing it… and a related thing: GitLab is a really handy tool for automating all sorts of services our users/stakeholders might want provided to them. GitLab looks like a way for us to say “yes, we can do that for you” which is cool to see more of. I know one developer currently at UCLA who has built his own personal CD stack using Rancher and GitLab. I intend to try to copy his setup. I’ve been nagging him for his docs, however, I also know that there is a nice blog post about this kind of thing, so, I think I ought to be able to muddle through on my own.

Keynotes

The opening keynote, by Casey Fiesler, was entitled “Growing Their Own: Building an Archive and a Community for Fanfiction”. Recording, Slides

It was inspiring to see what a group of dedicated volunteers could achieve, in bootstrapping a community-driven repository of user-generated content. I recommend watching this keynote.

The closing keynote, by Asaf Bartov, was entitled “Free Culture in the Periphery: A Personal Perspective”. Recording, Slides

This keynote was similarly inspiring: it showed what a dedicated community of volunteers is capable of, as well as the struggles and challenges this community faces.

I want to say more about one particular challenge, which I noticed during Asaf’s keynote, but I’ll save that for another time.

GT01: Samvera (June 5, 2018)

Esmé Cowles, from Princeton, started off this session with an introduction to Valkyrie, the process of how and why it came about, and the philosophy behind its development. Here is a remix of the video from the presentation with the slides added. I recommend watching it. Esmé’s talk kind of set the tone for Valkyrie for the rest of the conference, I think… it made it OK to talk about as if it’s a tool we can rely on being in our toolbox in the future.

UPDATE 6/20/2018: Oh, yeah, I had a part in this conference, too!

I was invited to serve as one of the Developer Track co-chairs, and after a brief wait for approval from my management, I said yes. It was pretty cool to help shape part of the conference. I’ve been a reviewer in the past, but rounding up and wrangling reviewers (and session chairs) for a track is surprisingly rewarding work. People are flattered you have asked them for help, and then they do help. I just want to say thank you to everyone who said yes to my pleas for help, I really appreciate it.


I especially want to say thanks to my fellow co-chair, Liz Krznarich, who was such a calm, steadying voice whenever I was inclined to simply freak out about whatever it was we needed to do. We got it done. Thanks, Liz!

Developer Workspaces panel

Here’s the abstract for the panel I proposed:

Some of us still develop the traditional way, and install the entire application stack on our own computers. But there are many other options available: Vagrant, Docker, or IDEs in the cloud. All approaches share the same aim: to minimize the effort required in standing up a new developer workspace, and to ensure this setup is shareable and repeatable. This panel will consist of live demos of all of these options, with plenty of opportunities to discuss best practices.

Here are the notes from the whole session (which includes links to all the slides). The panel was at the end of the session, so skip to the bottom of those notes if you just want to see the notes on the panel.

This panel was a lot of fun to do, (yes, even the live demo) and I hope it helps some people figure out what all these different tools are capable of, why one would choose to use them, and which is a good fit for what they want to do.

And because I skipped past the thank you slide at the end (it was a relief to be done!), here’s a link to that slide. Also, I’d like to thank all the panelists for the session, for agreeing to participate, and helping put together an amazing collection of work to demonstrate the current state of the art of developer workspaces. Begging your indulgence, I’ll just name them here (in alphabetical order): Terry Brady, Georgetown University Library, Liz Krznarich, ORCID, and Kate Lynch, University of Pennsylvania. Also, a huge thanks to former panelists who could not make it to OR: Erin Fahy, Stanford and Anusha Ranganathan, Cottage Labs. Even though they couldn’t make it, their participation and continuing advice helped shape the content of the panel presentations. Thanks again, I think we made a great team, and I hope to work with all of you again some day.

UPDATE 6/20/2018 10:41am CDT: Ideas Challenge

It’s hard for me to resist the allure of the Ideas Challenge, and I joined a team this year. Our team name was “GDPR – Wranglers vs Sheriffs”. My teammates were: Janet McDougall, Senior Data Archivist, Australian Data Archive; Saskia van Bergen, Senior Project Manager, Leiden University Libraries; and Harish Maringanti, Associate Dean for ITS, The University of Utah. Our proposed solution was to develop a checklist similar to the GDPR Checklist site, but with guidance more specific to repositories and research data. I wanted to produce a working demo based on the GDPR Checklist site’s code. However, the static site generator it uses, Gatsby.js, proved too difficult for me to set up while also attending sessions, so I set that aside and just gave a hand-wavy demo using the actual GDPR Checklist site. I’m happy to report that I continued tinkering with Gatsby.js on the way home, and on my first day back home… and… I got it working after all. Gatsby seems like a cool tool, and I will have to play with it more. As many people know, static site generators are an interest of mine. OH, I’m also happy to report that The Medical Research Council in the UK has some advice re GDPR so… if you’re worried about how GDPR might affect you as a researcher or someone who helps facilitate research data storage, check that out.

Random thoughts

DSpace 7 will be amazing!

DSpace 7 slides
DSpace 7 demo

DSpace 7 will be amazing! Why? 1) Configurable entities (i.e. you can customize the data model!), which could potentially be shared with the other repository shared-data-model work going on now. 2) ResourceSync is supported out of the box. 3) An industry-standard REST-API, courtesy of Spring Data REST, and a UI based on Angular 2. DSpace will feel like a desktop application! Expect to play with the beta in early 2019 (maybe earlier); it should be out and ready for deployments by next OR. Want to play earlier than that? They could use the help.

It’s exciting to see at least two communities rallying around the idea of customizing and sharing data models. It’ll be good to have at least two robust options for reflecting the sometimes complex metadata models our content requires of repository and digital library folks. Oh, and if you’re interested in this topic, I recommend checking out CASRAI. (Hat tip to Tim Donohue for that link.)


Over the past few months I have been looking into getting Hyrax (aka a Samvera reference implementation) set up on my work notebook. I think I’m close, and I wanted to share my working notes here, in case you want to follow along…

You must first love Ruby

OK, maybe “love” is too strong of a word, but you’ll at least need to install it, if you haven’t already.

Install Ruby

You’ll probably want to use a dedicated tool to manage Ruby versions, because that’s part of the fun of Ruby—you’ll need to use a different version of Ruby some day. And that day is going to be hard enough without trying to un-do or work around some other way of installing Ruby. Trust me.

My favorite way to install Ruby is with rbenv; however, uru is supposedly easier and cross-platform. And there are many other methods.

Gem Install Rails and Railties

You’re not going to be running all of Rails on your workstation (though you could, but there are a bunch of dependencies for Hyrax, and you’re going to use Docker to manage that mess). However, you’ll need the Rails gem installed so you can use the CLI-based Rails generator to build a new Hyrax application. And you’ll need Railties version 5.0.6 so you can run the preferred Rails generator (Railties is the gem that manages the generator stuff). So, as soon as you have Ruby installed, run these commands:

gem install rails
gem install railties:5.0.6

An early milestone of success: let’s generate a Hyrax application!

We have all the pieces we need, so let’s get this out of the way now. Run this:

cd path/to/your/project/or/workspace/folder
rails _5.0.6_ new awesomenameforyournewapp -m \
  https://raw.githubusercontent.com/samvera/hyrax/v2.1.0.beta2/template.rb

Blam, that felt good.

Gem Install Stack Car

We’re going to use Notch8’s Stack Car gem to help us manage Docker and Docker-Compose competently.

gem install stack_car

OK, don’t get too excited, but you’re almost ready to seriously hack on Hyrax. But first, you do have Docker and Docker-Compose installed, right?

Install Docker and Docker Compose

Sorry, that will probably be an epic journey of discovery. Docker seems to work better on Linux than any other OS… I’ve heard good things about OSX. But, this is a terse guide, and those links will get you started. Come back when you have Docker and Docker Compose installed. Good luck!
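Before going further, it can save some head-scratching to confirm that everything this guide leans on is actually on your PATH. Here is a minimal sketch of such a check. The function name check_tool is my own invention, and the tool list is just what this post has used so far, so adjust it to taste:

```shell
# Report whether a command is available on PATH.
# Prints "ok: NAME" or "missing: NAME"; returns 0 when found, 1 otherwise.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Check the tools this guide relies on; keep going even if one is missing.
for tool in ruby gem docker docker-compose; do
  check_tool "$tool" || true
done
```

Anything reported as missing is worth sorting out now, before Stack Car starts asking Docker to do things.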

Right, back to Hyrax and Stack Car

cd path/to/your/project/or/workspace/folder/awesomenameforyournewapp
sc dockerize .

Oooh, shiny. Now, as awesome as this is, you’ll need to make some adjustments.

Add the following to the default .env file:

REGISTRY_HOST=hub.docker.com
REGISTRY_URI=/phusion/passenger-ruby23
TAG=0.9.28
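If you end up re-running these steps (I did, more than once), pasting the same lines into .env again leaves duplicates behind. A small sketch of an idempotent alternative follows. The helper name set_env_default is hypothetical, and it assumes the simple KEY=VALUE layout shown above:

```shell
# Append KEY=VALUE to an env file unless a line for KEY is already there.
set_env_default() {
  file=$1 key=$2 value=$3
  touch "$file"
  if ! grep -q "^${key}=" "$file"; then
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example usage, from the application folder:
#   set_env_default .env REGISTRY_HOST hub.docker.com
#   set_env_default .env REGISTRY_URI /phusion/passenger-ruby23
#   set_env_default .env TAG 0.9.28
```

Because an existing key is left alone, values you have already customized survive a re-run.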

Remove the last 4 lines from the docker-compose.yml file

   depends_on:

volumes:

^^^ that may or may not be necessary, but on the version of Docker and Docker Compose I was using, it was.
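If you would rather not trim the file by hand, here is a rough sketch of doing it from the shell. The function name drop_last_lines is made up for this post, and since the generated file may differ between Stack Car versions, look at the last lines first and keep a backup:

```shell
# Write FILE minus its last N lines to stdout
# (a portable alternative to GNU `head -n -N`).
drop_last_lines() {
  file=$1 n=$2
  total=$(( $(wc -l < "$file") ))
  keep=$((total - n))
  if [ "$keep" -gt 0 ]; then
    head -n "$keep" "$file"
  fi
}

# Example usage, from the application folder:
#   tail -n 4 docker-compose.yml        # inspect what would be removed
#   cp docker-compose.yml docker-compose.yml.bak
#   drop_last_lines docker-compose.yml.bak 4 > docker-compose.yml
```

If things go sideways, the .bak copy gets you back to what Stack Car generated.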

Change the ports line in the docker-compose.yml file to read

    ports:
      - "3000:80"
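In case the ports syntax is new to you: Docker Compose reads the short form as HOST:CONTAINER, so this mapping is what puts the app on port 3000 of your workstation. A tiny illustration that just splits the string (the variable names are mine, purely for show):

```shell
# Split a Compose short-syntax port mapping into its two halves.
mapping="3000:80"
host_port=${mapping%%:*}        # everything before the colon
container_port=${mapping##*:}   # everything after the colon

echo "http://127.0.0.1:${host_port}/ on the host reaches port ${container_port} in the container"
```

If port 3000 is already taken on your machine, changing only the left-hand number should be enough.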

Let’s start this baby up

Now we can crank up our Dockerized Hyrax app:

sc up

Have fun

You should be in a good place to further explore what Hyrax is and what you can make it do. Need a place to start? Start here:

This is a work in progress!

So, I don’t yet have access to any documentation on how Stack Car works or what you should do next. I wish I did. If you’re following along, you can muddle through with me, or you can wait patiently for me to update this blog post with links to more documentation.

UPDATE: so, you’ll have an instance of Hyrax running on http://127.0.0.1:3000/ but you’ll need to do a couple of things before it’s actually usable.

1. You’ll need to run db:migrate:

sc be rails db:migrate

2. You’ll need to chown and chmod your tmp folder:

sc exec bash
chown root tmp
chmod 777 tmp
# note this is really naughty, but you know, it's a dev environment, so get over it

Now check http://127.0.0.1:3000/ and you should see your new Hyrax site waiting for you to hack on. Get to it, buddy. Oh, for the db:migrate command, you might need to send a slightly different one. Rails will tell you what to run. You’re almost there.

UPDATE2: If you’re running Docker on a Linux notebook (I am) all sorts of things will be slightly off while you’re working with Stack Car. I’m not sure of the cause, I’m researching, but it’s something to do with the way the named mounts are loaded with the Docker Compose file created by Stack Car. The main problem is that Docker wants to run as root, which means files created by root in the containers are owned by root (hence the janky chown and “rootme” permissions up above). There are workarounds for most of the issues, but the one that I can’t seem to fix is that whenever a rails command is run on the container, the files that rails command creates will be owned by the root user. That ownership translates over to the host. Which means I won’t be able to edit those files. Which is a real downer as far as developer experience is concerned (the entire point is to be able to work with these files… not being able to work with them makes me very table-flippy). I suspect a bit more cleverness with file permissions might be enough to hobble along… but… I also suspect there may be a simple thing I can change in the Docker Compose file to have these mounts work correctly without any monkey business. I suppose the real question is: is it worth my time to invest any further effort to get this working environment to work for me or should I move back to Vagrant—a tool I trust to deliver a usable (albeit rather slow) working environment.

UPDATE3: I think this is the source of the magic on OSX… apparently things just work over there? More research required.

UPDATE4: Docker uses something called a storage driver to handle how containers talk to storage on the host computer. It looks like my default storage driver was set to aufs, which isn’t quite the same as overlay2, which is the recommended storage driver. So, I’ve followed this suggestion and now have overlay2 set as my Docker storage driver. I then did a bunch more research, because that config change was not enough to handle the permissions issues I am encountering. And I found this on StackOverflow, so I added a :z at the end of my volume lines in docker-compose.yml, like so:

solr:
      image: solr:latest
      env_file:
       - .env
       - .env.development
      ports:
       - "8983:8983"
      volumes:
       - './solr:/opt/solr/server/solr/mycores:z'
      entrypoint:
       - docker-entrypoint.sh
       - solr-precreate
       - samvera
       - /opt/solr/server/solr/mycores/config

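As far as I could tell, the :z flag makes the host folder’s files and permissions usable from the container. (Docker’s documentation describes :z as an SELinux relabeling option for bind mounts shared between containers, so on a host without SELinux it may have no effect.) With this flag in place, you can run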

chmod -R 777 solr

in your host working directory, and you’ll have proper permissions set up, so you can edit those solr configs on your host and then have Docker load the Solr container correctly. The same strategy applies to all the other mounts you might want to work with (like the one for the web container, where all the app files and configs go). After you run a rails task to generate new MVC files, you’ll probably need to run the following in a terminal on your host:

sudo find . -type f -user root -exec chmod 777 {} +

You can then edit the files the rails tasks leave for you, via an editor or IDE (Atom or RubyMine), on your host. Is this annoying that you have to tinker with permissions all the time? Yes… but it does work, and you’ll only have to do it once for each file you create. Probably you could get creative with a sticky bit and setting the GID on the container… that will be fun for another day.
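A gentler variation, which I have not tested as thoroughly, is to hand the root-owned files back to your own account rather than opening them up to everyone. The sketch below only lists candidates, so you can see what would be touched first. The function name files_owned_by is hypothetical:

```shell
# Print regular files under DIR that are owned by OWNER (a name or numeric uid).
files_owned_by() {
  find "$2" -type f -user "$1"
}

# Example usage, from the application folder on the host:
#   files_owned_by root .
# Then, if the list looks right, change the owner instead of the mode:
#   sudo find . -type f -user root -exec chown "$(id -un)" {} +
```

You would still need to repeat this after each rails task that creates files, same as with the chmod approach.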

Speaking of fun, it turns out you may need to run another task before you can create the default admin_set. The docs may be out of date, so I made a ticket for that issue.

TLDR: here’s what you need to type in order to get your Hyrax app running in Stack Car:

sc be bundle install
sc be rails hyrax:default_collection_types:create #may not be necessary, but there's no harm in re-running it
sc be rails hyrax:default_admin_set:create

And you need to add an admin role to the role_map.yml file.

Basically you need to follow along with the getting started guide. ENJOY!

UPDATE5: Here are a few other helpful links for getting started: