New Professional Photo


Last week we did staff photos at the library. This one was taken by Bryan Hull. Time to update all my profile pictures everywhere! The one I’ve been using is from February 2012, taken by Marsha Pirone at Kresge Library, Dartmouth College. I don’t look that different, do I?

Library Carpentry Instructor Training

Last month I attended the Portland Library Carpentry Instructor Training, co-taught by Belinda Weaver and Tim Dennis, with helpers John Chodacki and Juliane Schneider. This was the first “Train the Trainer” session for librarians, covering some basics of educational psychology and instructional design, and is the first step to getting certified as a Software Carpentry AND Data Carpentry instructor. Thanks to csv,conf,v3 and the California Digital Library for sponsoring this program!

Tweets can be found under the hashtag #porttt. You can also read Juliane’s recap of this and related events. We had a lot of fun learning and getting to know each other. Personally, much of it felt like a condensed review of ACRL’s Immersion Program – Teacher Track. Hilariously, I sat next to Shari Laster, who also was in that same program 4 years ago with me!

Since I flew in early on Day 1 of the training and out late after Day 2, I wasn’t expecting to do much exploring around Portland. But lucky for me, Veronica Ikeshoji-Orlati and I got into some great discussions over post-Day 1 drinks and decided to continue chatting over dinner. We ended up walking all over town, ate at a decent sushi place (where I had ramen, which was only ok), and then capped off the night with ice cream at Ruby Jewel.

Adventures in #portland; day 1 of 2. #porttt #latergram

Brian Avery was the only other participant from Salt Lake City, and he’s a professor at Westminster. We got a chance to talk on our way to the airport, and I learned about some really cool initiatives that he’s working on, especially with respect to teaching reproducibility to undergrads. I’m hoping he’ll give a talk about it sometime during this upcoming academic year at one of the programs that I’m working on here at Eccles Library (more on that in another post later).

Library Carpentry lessons are created and improved upon by volunteers, so the Mozilla Global Sprint was the perfect event to organize people for further development. Belinda spearheaded the Library Carpentry Sprint, which happened on June 1-2. Check out the Gitter channel where we communicate. Unfortunately, I was so overwhelmed with my regular work duties that I didn’t get a chance to do any actual work on it! Betty Rozum at USU invited me to join their local meetup, and I was able to do so via Google Hangouts for a short time, which was a lot of fun. When I finally have some free time, I’m planning to go back and figure out how I can build on where others have left off.

I have a few more steps to complete before I’m officially a certified instructor, but I already feel like I’m part of the community and am eager to start teaching some of the lessons under the banner! Also, I want to build on this project in particular: Reproducible Research using Jupyter Notebooks. But that’s a topic for a whole other post…

Journey to DSVIL

The next Data Science and Visualization Institute for Librarians (DSVIL) at North Carolina State University (NCSU) was announced in late November 2016, and naturally everyone made sure I knew about it. My job title may say “Data Science Librarian” but I’m always on the lookout for relevant professional development. I had come back from the Bibliometrics and Research Assessment Symposium earlier that month and was just winding down from hosting the Research Reproducibility Conference. For my own work, I am especially interested in creating a reproducible workflow to generate visualizations and documentation of research outputs and impact at different levels of the university. I have grand plans to use the ORCID Public API to build an interactive web dashboard that would pull in publication data and then visualize it, once I figure out how!
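
To make that dashboard idea a little more concrete, here is a minimal sketch in Python of the "pull in publication data" step. It is only an illustration under a few assumptions: the `v3.0` endpoint version, the function names `fetch_works` and `works_per_year`, and the exact shape of the JSON I read from are my own guesses at a starting point, not a tested workflow.

```python
import json
import urllib.request
from collections import Counter

# Base URL of the ORCID Public API; the version segment is an assumption.
ORCID_API = "https://pub.orcid.org/v3.0"


def fetch_works(orcid_id):
    """Download the public works summary for one ORCID iD as a dict."""
    request = urllib.request.Request(
        f"{ORCID_API}/{orcid_id}/works",
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def works_per_year(works):
    """Count works by publication year, one count per grouped work.

    Assumes each entry in "group" holds a "work-summary" list whose items
    may carry a "publication-date" with a nested "year" value.
    """
    counts = Counter()
    for group in works.get("group", []):
        summaries = group.get("work-summary", [])
        if not summaries:
            continue
        date = summaries[0].get("publication-date") or {}
        year = (date.get("year") or {}).get("value")
        counts[year or "unknown"] += 1
    return dict(sorted(counts.items()))


# Example (needs network access; the iD below is ORCID's documented sample record):
# print(works_per_year(fetch_works("0000-0002-1825-0097")))
```

The per-year counts are the kind of table a plotting library could turn into a bar chart, which is where the visualization half of the plan would pick up.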

My knowledge of data science has so far been acquired piecemeal; I taught myself how to program in R and watched webinars on data visualization. Every few years I teach myself Python again to keep up with a second programming language; I’ve relearned it 3 times now. I have guest-lectured in undergraduate and graduate level courses on data visualization best practices, using the ggplot package to generate plots, and writing in LaTeX. However, I still lack the depth of knowledge (e.g. statistical analysis, data wrangling, using APIs) and practical experience needed to be a valuable member of a research team. DSVIL looked like the perfect program to build up additional knowledge and skills!

Costs & Funding

The tuition alone for this program is $2500 and includes an evening reception and daily breakfast/lunch, but not travel or lodging. Sticker shock led to my initial dismissal of the opportunity until my library director encouraged me to go for it. We would figure out a way to fund me to attend if I was accepted. So I applied for any possible external funding to help supplement EHSL funds:

  • MLA Continuing Education (CE) Grant [$500]
  • MLA MIS Career Development Grant [$1500]
  • NNLM MCR Professional Development Funding [$1500]

Happily I was accepted to DSVIL! Thanks to EHSL, MLA CE, and NNLM MCR for covering the tuition, travel expenses, and remaining meals. And thanks to extended family for housing me this coming week! Who knew Raleigh could be so expensive‽


As a condition of funding, I will be formally writing up this experience for NNLM MCR’s quarterly newsletter. I can’t promise I’ll have time to post daily here, but my goal is to have detailed notes at the end of it. I anticipate doing my usual tweeting using #DSVIL, but the schedule is tight and there’s a lot to learn! I am looking forward to meeting the other participants and being part of this network of data science librarians.

Recap of Bibliometrics & Research Assessment Symposium

Last week I attended Bibliometrics & Research Assessment: A Symposium for Librarians and Information Professionals, which was jointly organized by the National Institutes of Health Library and the SLA Maryland Chapter. Side note: I’m really impressed by whoever designed the logo for the banner and handouts.

There were a number of people tweeting throughout the event using #bibliometrics or #nihbibres16. Here’s my complete list of tweets, but major takeaways include:

Keynote addresses are archived for both Ludo Waltman (video; slides) and Katy Börner (video; slides), and available as tweets (thanks to PF Anderson for compilation). They’re both definitely worth watching if you have a couple hours to focus on bibliometrics and data visualization.

The poster session was a great opportunity to see what others in the same line of work are doing. Unfortunately, posters are not yet available online, but here’s the handout with the titles and authors. It was announced later that they will be collecting them after the symposium.

[2016/12/1] Update: Posters are up on the SLA Maryland Chapter’s website!

Takeaways from the afternoon discussion:

  • There are a lot of tools out there! Notable mentions: Sci2, Tableau, Cytoscape, Pajek, IN-SPIRE, BibXL, NodeXL, D3.js, HistCite, R, Python, VOSviewer
  • For cleaning data, OpenRefine is great for author or organization names. Gephi can also be used to clean but it’s more time consuming. VantagePoint, while not free, can do fuzzy matching.
  • For full text mining and analysis: NVivo, R (tm and topicmodels packages), Quosa (but confined to this platform)
  • Caution: pick peer comparators that are actually comparable.
  • On tracking publications of a researcher in different positions/institutions: make sure they have an ORCID. Use citation alert services. Search ResearchGate and NIH RePORTER.
  • Go to research team meetings to talk about bibliometric/research assessment services rather than wait for them to come to you or your workshops. Be (pro)active. Important to meet with directors to clarify what metrics mean to address misinterpretation or misuse.
  • IN-SPIRE was actually designed to identify research gaps. Otherwise, you can search for “surprising” or “world changing”, which may give some insight.
  • Altmetrics show early impact and attention, and give a nice visual (i.e. the donut). They’re useful to show you’ve reached the public.
  • Someone mentioned there’s a moderate relationship between Mendeley readers and citation count. I’ll have to verify that claim another day.
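
For the name-cleaning point above, here is a rough sketch in Python of the general idea behind grouping name variants by a normalized key. It is a simplified approximation written for illustration, not OpenRefine's or VantagePoint's actual implementation, and the function names and sample names are made up.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(name):
    """Reduce a name to a key that ignores case, accents, punctuation, and word order."""
    # Strip accents by decomposing characters and dropping the non-ASCII marks.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    # Lowercase, trim, and remove punctuation.
    cleaned = re.sub(r"[^\w\s]", "", ascii_name.strip().lower())
    # Sort the unique tokens so "Utah, University of" matches "University of Utah".
    return " ".join(sorted(set(cleaned.split())))


def cluster_names(names):
    """Group names that share a fingerprint; return only groups with variants."""
    clusters = defaultdict(list)
    for name in names:
        clusters[fingerprint(name)].append(name)
    return [group for group in clusters.values() if len(group) > 1]


# Hypothetical organization names as they might appear in exported records.
names = [
    "University of Utah",
    "Utah, University of",
    "university of utah.",
    "Dartmouth College",
]
print(cluster_names(names))
```

This only catches variants that differ in case, accents, punctuation, or word order. Misspellings would need true fuzzy matching (for example, an edit-distance comparison), which is the part the paid tools advertise.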

After the reception, I had a great dinner and conversations with Kris Alpi, Karen Gutzman, and Abby Adamczyk at a local Spanish tapas place. Kris and Karen both presented posters and are much deeper in providing bibliometric services at their institutions. Abby used to work at Eccles!

Chris Belter and Ya-Ling Lu held a hands-on training session the next day. Materials from the workshop are here. The focus was on where to find bibliometric features and learning/practice using the tools (VLOOKUP in Excel, Sci2, Gephi). Email me if you want my annotations on the workshop slides. NIH Library is planning to publish their workflow next year.

After the workshop ended, I popped over to check out the NIH Library and had the good fortune to meet Josh Duberman who gave me a personal tour. Then I had a lovely time catching up with Kathel Dunn at the National Library of Medicine. She was my mentor when I did my ARL CEP Fellowship at NLM over 5 years ago.

Overall, it was a great trip and I learned a lot about bibliometrics, what tools to use for analysis and visualization, and what others are doing at their institutions. Plus, networking!

New Publications


Quick updates: what an exciting month this has been! This week I signed up for my figshare account and got a DOI for my LibGuides Project Team poster. My plan now is to turn this poster into a write-up and publish it somewhere. Where? TBD.

Mellanye Lackey presented the other poster I was co-author on.


Earlier this month, my short piece for the MLA News Technology column was also published. If you’re an MLA member, please enjoy. If you are not, the gist of it is in my post here.

New Job (&) Title: Data Science Librarian

I’m a tad late in announcing it: my academic staff position at Eccles Health Sciences Library (EHSL) has transitioned to a full-time faculty position effective September 1 (3 weeks ago). My new title is officially “Data Science Librarian” and the best way to show you my job responsibilities and areas is with a wordle!

data science librarian wordle

So feel free to ask me about anything, especially in the areas noted above. Questions?

Highlighted Tools for Advanced Searching

This post is long overdue. In May at MLA 2016, I attended a CE course on Advanced Searching Techniques and Advanced Strategy Design with Julie Glanville and Carol Lefebvre. About half of the material they presented was completely new to me, and one of the most useful sections they covered was on tools that help with building a search strategy. During the conference, they each presented at a Sunrise Seminar and highlighted some more tools.

In no particular order, check these out for identifying subject headings, keywords, top authors, top publications, and other text mining capabilities…

In addition, they noted that EndNote can be used to create a “reference list” of MeSH or EmTree term frequency. I haven’t tried this yet, but it sounds promising if you can figure out how to get the MeSH and EmTree terms imported into the EndNote library.
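
Since I haven’t worked out the EndNote route, here is a sketch of a different way to get at the same term-frequency idea: counting MeSH headings directly from records saved in PubMed’s MEDLINE text format. This is a swapped-in approach, not the method from the course. It assumes MeSH lines start with the `MH  - ` tag, that subheadings follow a slash, and that an asterisk marks a major topic; the function name and the two sample records are invented for illustration.

```python
from collections import Counter

MESH_TAG = "MH  - "  # MeSH heading tag as it appears in MEDLINE-format exports


def mesh_term_frequency(medline_text):
    """Count how often each main MeSH heading appears across all records."""
    counts = Counter()
    for line in medline_text.splitlines():
        if line.startswith(MESH_TAG):
            # Keep the main heading only: drop subheadings and the major-topic asterisk.
            heading = line[len(MESH_TAG):].split("/")[0].lstrip("*").strip()
            counts[heading] += 1
    return counts.most_common()


# Two made-up records, trimmed to the fields that matter here.
sample = """PMID- 1
MH  - *Bibliometrics
MH  - Humans
MH  - Periodicals as Topic/*statistics & numerical data

PMID- 2
MH  - Bibliometrics
MH  - Libraries, Medical/*organization & administration
"""

for heading, count in mesh_term_frequency(sample):
    print(count, heading)
```

Run against a set of known relevant articles, a list like this is one way to spot candidate subject headings for a search strategy.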

Other mentioned resources of interest:

For more on the CE course, see my storify:

For more on discussions around systematic reviews from MLA 2016, see