# tlohde

paper trails

22:58 17/06/2025

reference's references

Input a single DOI and see the temporal distribution of the references; and then the distribution of the references' references. And so on back. Until the beginning of time, or until I decide enough is enough.


why?

This is an extension of the previous thing, and was motivated by this reply on mastodon. I was curious to see how quickly[1] the earliest reference hops back in time with each iteration. So I set to it.

goal

fears

method rambling

All data is grabbed from OpenAlex[4] and their handy API.

If the goal was just to go back one generation, I could have used the group_by=publication_year parameter, which would have been great, because all the results would have fitted onto a single page. Unfortunately, group_by doesn't return the ids, so instead I have to page through the results, which was a bit fiddly: pages show a maximum of 200 results at a time. An additional fiddle was splitting up the queries into batches of 50. A little status update thing endeavours to keep the user informed.[5]
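A minimal sketch of the batching and paging described above, not the code this page actually runs. The helper names (chunk, buildUrl, fetchAllPages) are made up for illustration, and the select fields and the example filter are assumptions about what a query like this might ask for. It relies on OpenAlex's cursor paging: start with cursor=* and follow meta.next_cursor until it comes back null.

```javascript
// Hypothetical helpers: names and chosen fields are illustrative assumptions.
const BASE = "https://api.openalex.org/works";

// Split a list of ids into batches of at most `size` (50 here, as in the post).
function chunk(ids, size = 50) {
  const batches = [];
  for (let i = 0; i < ids.length; i += size) {
    batches.push(ids.slice(i, i + size));
  }
  return batches;
}

// Build the URL for one page of results; 200 is the per-page maximum.
function buildUrl(filter, cursor = "*") {
  const params = new URLSearchParams({
    filter,
    select: "id,publication_year,referenced_works", // assumed fields
    "per-page": "200",
    cursor,
  });
  return `${BASE}?${params}`;
}

// Follow meta.next_cursor until there is nothing left to fetch.
// `fetchJson` is injectable so the loop can be exercised without the network.
async function fetchAllPages(
  filter,
  fetchJson = (url) => fetch(url).then((r) => r.json())
) {
  const works = [];
  let cursor = "*";
  while (cursor) {
    const page = await fetchJson(buildUrl(filter, cursor));
    works.push(...page.results);
    // Stop on an empty page or when no further cursor is offered.
    cursor = page.results.length ? page.meta.next_cursor : null;
  }
  return works;
}
```

One batch might then be requested with something like fetchAllPages(`ids.openalex:${batch.join("|")}`), assuming the ids are OR-ed together with a pipe in a single filter.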

I don't think there's any danger of hitting the rate limits. But there might be. The requests aren't "polite": to be polite, calls to OpenAlex need to include an email address, and, well, I thought I'd spare you the indignity[6] of having to input one.
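For what it's worth, being polite is a small change. A sketch, assuming the email address goes in a mailto query parameter, which is how OpenAlex has documented it; withMailto is a made-up helper name and this page doesn't do any of this.

```javascript
// Hypothetical helper: append a contact email to an OpenAlex request URL.
// With no email supplied, the URL is returned unchanged in meaning.
function withMailto(url, email) {
  const u = new URL(url);
  if (email) {
    u.searchParams.set("mailto", email);
  }
  return u.toString();
}
```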

In my testing[7] the first 3 generations usually happen pretty painlessly. Beyond that, you'll have to: (a) demonstrate you really want it; (b) be patient; (c) be lucky.

The earliest item in each generation is put into the table, with a link to its entry in OpenAlex.
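Picking that earliest item could look roughly like this. A sketch, assuming each work is an object with the publication_year and id fields OpenAlex returns (the id is itself a link to the entry); earliestWork is an illustrative name.

```javascript
// Hypothetical helper: return the work with the smallest publication_year,
// or null if no work in the generation has a usable year.
function earliestWork(works) {
  return works
    .filter((w) => Number.isInteger(w.publication_year))
    .reduce(
      (best, w) =>
        best === null || w.publication_year < best.publication_year ? w : best,
      null
    );
}
```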

plotting

First time using plotly's js library. I didn't try to get fancy, and let the defaults be. Handily, it includes functionality such as double-clicking a legend entry to show only that entry (inc. re-scaling the axes appropriately).
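Roughly how the data might be shaped for plotly: one bar trace per generation, with counts per publication year. This is a sketch rather than the page's actual code; countByYear and toTraces are made-up names, and the "plot" div id is an assumption.

```javascript
// Hypothetical helper: count works per publication year, sorted by year.
function countByYear(works) {
  const counts = new Map();
  for (const w of works) {
    if (w.publication_year == null) continue; // skip works with no year
    counts.set(w.publication_year, (counts.get(w.publication_year) ?? 0) + 1);
  }
  const years = [...counts.keys()].sort((a, b) => a - b);
  return { x: years, y: years.map((yr) => counts.get(yr)) };
}

// Hypothetical helper: one bar trace per generation, named for the legend.
function toTraces(generations) {
  return generations.map((works, i) => ({
    type: "bar",
    name: `generation ${i + 1}`,
    ...countByYear(works),
  }));
}

// In the browser, with plotly.js loaded and a <div id="plot"> on the page
// (both assumed), the defaults do the rest:
// Plotly.newPlot("plot", toTraces(generations));
```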

It'd be nice if, upon hovering or clicking on a bar, I could show the list of underlying references.

A big family tree / network diagram would be nice too... Ditto a separate plot with generation on the y-axis and time on the x-axis.

next?

parking this for a while, i think. It's late.

useful services

and doubtless many others

footnotes


  1. or slowly ↩︎

  2. wobbly, and fragile and poorly structured js, because i don't know what i'm doing ↩︎

  3. the ID in OpenAlex ↩︎

  4. Priem, J., Piwowar, H., & Orr, R. (2022). OpenAlex: A fully-open index of scholarly works, authors, venues, institutions, and concepts. ArXiv. https://arxiv.org/abs/2205.01833 ↩︎

  5. if there is no response from OpenAlex, or the request hangs, I don't notify the user. sorry. check your browser's console and refresh the page. or better still: just give up ↩︎

  6. and me the hassle. it's really not much hassle come to think of it ↩︎

  7. that's me being very generous ↩︎


have thoughts? want to share? email me, or find me on mastodon where you can reply to this post