## Introducing debkeepr

### An R package for the analysis of non-decimal currencies

After an extensive period of iteration and a long but rewarding process of learning about package development, I am pleased to announce the release of my first R package. The package is called debkeepr, and it derives directly from my historical research on early modern merchants. debkeepr provides an interface for working with non-decimal currencies that use the tripartite system of pounds, shillings, and pence that was used throughout Europe in the medieval and early modern periods. The package includes functions to apply arithmetic and financial operations to single or multiple values and to analyze account books that use double-entry bookkeeping with the latter providing the basis for the name of debkeepr. In a later post I plan to write about the package development process, but here I want to discuss the motivation behind the creation of the package and provide some examples for how debkeepr can help those who encounter non-decimal currencies in their research.

You can install debkeepr from GitHub right now with devtools, and I am planning to submit the package to CRAN soon. Feedback is always welcome and any bug reports or feature requests can be made on GitHub.

# install.packages("devtools")


## Pounds, shillings, and pence: lsd monetary systems

The system of expressing monetary values in the form of pounds, shillings, and pence dates back to the Carolingian Empire and the change from the use of gold coins that derived from the late Roman Empire to silver pennies that had taken place by the eighth century. Needing ways to count larger quantities of the new silver denarius, people began to define a solidus, originally a gold coin introduced by the Emperor Constantine, as a unit of account equivalent to 12 denarii. For even larger valuations, the denarius was further defined in relation to a pound or libra of silver. Though the actual number of coins struck from a pound of silver differed over time, the rate of 240 coins lasted long enough to create the custom of counting coins in dozens (solidi) and scores of dozens (librae). The librae, solidi, and denarii (lsd) monetary system was translated into various European languages, and though the ratios between the three units often differed by region and period, the basic structure of the system remained in place until decimalization began following the French Revolution.1

## One Year Anniversary

### Some reflections

It was a year ago today that I launched this website with a post introducing myself and the goals for the blog. When I launched the site, I was at the beginning of teaching myself about digital humanities and how to code in R. Building this Hugo site was part of my learning process. I learned about the command line, Git, and GitHub, not to mention a bit of HTML and CSS to get the website up and going. I also wanted to share my progress and provide information for others who might want to go down a similar path. Over the past year I have gone from a coding newbie to actively working on two digital humanities projects and closing in on the launch of my first R package that makes it easier to work with historical non-decimal currencies.

After writing a couple of posts on my thoughts about digital humanities, I have concentrated on writing posts about using R to analyze and visualize historical data. I have always tried to adopt the perspective of a newcomer to code and to R, since I was in that position not all that long ago. The rsats community has done an admirable job in trying to make learning R as approachable as possible, but there is no way around the fact that learning to code is a daunting task. My posts have mainly focused on using R for GIS and network analysis since these are the two topics most pertinent to the digital humanities projects I have been working on. Writing posts on these topics proved to be the best way for me to learn not only how to create maps and network graphs with R but also about the fields of GIS and network analysis in general.

Even if no one read my posts they would have been useful endeavors for all that I learned through writing them. However, the most gratifying aspect of writing this blog has been hearing that the content has proven useful to others in their efforts to learn R. It has been particularly amazing and amusing to hear from people in the sciences and think about biologists and chemists learning to code by analyzing letters from a sixteenth-century merchant in the Low Countries, which is the data set that I have used in my posts.

I never could have imagined that this website and my blog posts would be read by very many people, and I have been amazed by the steady stream of visitors that continue to read the website daily. The digital humanities and rstats communities have been amazingly open and welcoming, and I never could have gotten this far nor had this much fun without the kindness of the people in these communities. I have received very generous encouragement and increased visibility through tweeting out my posts from Maëlle Salmon, Mara Averick, Hadley Wickham, and the great R4DS community among many others. To these individuals and to the entire community that has come to read my blog I can only say thank you.

In a nice little coincidence this is the 11th post to the blog and pushes my first introductory post to the second page of blog entries. I am looking forward to the next year of continuing to learn more about digital humanities and R and writing about my experiences on this blog, pushing more and more posts to the second page and beyond.

## Great Circles with R

### Three methods with sp and sf

In 1569 the Flemish cartographer and mathematician Gerardus Mercator published a new world map under the title “New and more complete representation of the terrestrial globe properly adapted for use in navigation.” The title of the map points to Mercator’s main claim for its usefulness, which he expounded upon in the map’s legends. Mercator presented his map as not only an accurate representation of the known world, but also as a particularly useful map for the purposes of navigation. As described in the third legend, Mercator aimed to maintain conformity to the shape of land masses even towards the poles and to have straight lines on the map accurately represent directionality. To achieve his goals Mercator used a projection in which lines of longitude and latitude were made perpendicular at all values by increasing the distance between degrees of latitude as they reach the pole.1 Mercator’s projection had the benefit that straight lines drawn on the map are rhumb lines, lines of constant bearing that pass every degree of longitude at the same angle. Theoretically this simplified oceanic navigation; a ship captain could draw a straight line from one port to another, calculate the bearing, and maintain that bearing along the voyage. However, 16th-century navigators used magnetic courses and not longitude and latitude values as Mercator’s map assumed.2 An accurate means to measure longitude at sea was only discovered in the second half of the 18th century with the development of the sextant and later the marine chronometer.3

World Map by Gerardus Mercator, 1569

The Mercator projection was designed with certain uses in mind. Mercator’s emphasis on perpendicular lines of longitude and latitude and the equivalence of straight lines and rhumb lines were meant to simplify navigation and have recently proved useful for online mapping services. However, the stretching of latitudes towards the poles distorts the size of land masses, making those closer to the poles appear larger than those near the equator. The stress on rhumb lines in Mercator’s map also highlights the difference between lines of constant bearing (rhumb or loxodrome lines) and the shortest distance between two points (great circles). Due to Earth’s ellipsoidal nature, the shortest distance between two points is not necessarily a straight line. For instance, to fly from Los Angeles to Amsterdam, one would not want to fly in a straight line of constant bearing at 78 degrees. Instead, you would want to make an arc to the north to take advantage of the ellipsoidal shape of the Earth. By flying along the great circle from Los Angeles to Amsterdam one would travel 1120 kilometers less than flying along the rhumb line.

## An Exploration of Simple Features for R

### Building sfg, sfc, and sf objects from the sf package

My previous post provided an introduction to the sp and sf packages, showing how the two packages represent spatial data in R. There I discussed the creation of Spatial and sf objects from data with longitude and latitude values and the process of making maps with the two packages. In this post I will go further into the details of the sf package by examining the structure of sf objects and how the package implements the Simple Features open standard. It is certainly not necessary to know the ins and outs of sf objects and the Simple Features standard to use the package — it has taken me long enough to get my head around much of this — but a better knowledge of the structure and vocabulary of sf objects is helpful for understanding the effects of the plethora of sf functions. There are a variety of good resources that discuss the structure of sf objects. The most comprehensive are the package vignette Simple Features for R and the overview in Chapter 2 of the working book Geocomputation with R by Robin Lovelace, Jakub Nowosad, and Jannes Muenchow. This post is based on these sources, as well as my own sleuthing through the code for the sf package.

Before diving in, let’s take a step back to provide some background to the package. The sf package implements the Simple Features standard in R. The Simple Features standard is widely used by GIS software such as PostGIS, GeoJSON, and ArcGIS to represent geographic vector data. The sf package is designed to bring spatial analysis in R in line with these other systems.1 The standard defines a simple feature as a representation of a real world object by a point or points that may or may not be connected by straight line segments to form lines or polygons. A simple feature can contain both a geometry that includes points, any connecting lines, and a coordinate reference system to identify its location of Earth and attributes to describe the object, such as a name, values, color, etc. The sf package takes advantage of the wide use of Simple Features by linking directly to the GDAL, GEOS, and PROJ libraries that provide the back end for reading spatial data, making geographic calculations, and handling coordinate reference systems.2

## Introduction to GIS with R

### Spatial data with the sp and sf packages

The geographic visualization of data makes up one of the major branches of the Digital Humanities toolkit. There are a plethora of tools that can visualize geographic information from full-scale GIS applications such as ArcGIS and QGIS to web-based tools like Google maps to any number of programing languages. There are advantages and disadvantages to these different types of tools. Using a command-line interface has a steep learning curve, but it has the benefit of enabling approaches to analysis and visualization that are customizable, transparent, and reproducible.1 My own interest in coding and R began with my desire to dip my toes into geographic information systems (GIS) and create maps of an early modern correspondence network. The goal of this post is to introduce the basic landscape of working with spatial data in R from the perspective of a non-specialist. Since the early 2000s, an active community of R developers has built a wide variety of packages to enable R to interface with geographic data. The extent of the geographic capabilities of R is readily apparent from the many packages listed in the CRAN task view for spatial data.2

In my previous post on geocoding with R I showed the use of the ggmap package to geocode data and create maps using the ggplot2 system. This post will build off of the location data obtained there to introduce the two main R packages that have standardized the use of spatial data in R. Thesp and sf packages use different methodologies for integrating spatial data into R. The sp package introduced a coherent set of classes and methods for handling spatial data in 2005.3 The package remains the backbone of many packages that provide GIS capabilities in R. The sf package implements the simple features open standard for the representation of geographic vector data in R. The package first appeared on CRAN at the end of 2016 and is under very active development. The sf package is meant to supersede sp, implementing ways to store spatial data in R that integrate with the tidyverse workflow of the packages developed by Hadley Wickham and others.

There are a number of good resources on working with spatial data in R. The best sources for information about the sp and sf packages that I have found are Roger Bivand, Edzer Pebesma, and Virgilio Gómez-Rubio, Applied Spatial Data Analysis with R (2013) and the working book Robin Lovelace, Jakub Nowosad, Jannes Muenchow, Geocomputation with R, which concentrate on sp and sf respectively. The vignettes for sf are also very helpful. The perspective that I adopt in this post is slightly different from these resources. In addition to more explicitly comparing sp and sf, this post approaches the two packages from the starting point of working with geocoded data with longitude and latitude values that must be transformed into spatial data. It takes the point of view of someone getting into GIS and does not assume that you are working with data that is already in a spatial format. In other words, this post provides information that I wish I knew as I learned to work with spatial data in R. Therefore, I begin the post with a general overview of spatial data and how sp and sf implement the representation of spatial data in R. The second half of the post uses an example of mapping the locations of letters sent to a Dutch merchant in 1585 to show how to create, work with, and plot sp and sf objects. I highlight the differences between the two packages and ultimately discuss some reasons why the R spatial community is moving towards the use of the sf package.

## Introduction to Network Analysis with R

### Creating static and interactive network graphs

Over a wide range of fields network analysis has become an increasingly popular tool for scholars to deal with the complexity of the interrelationships between actors of all sorts. The promise of network analysis is the placement of significance on the relationships between actors, rather than seeing actors as isolated entities. The emphasis on complexity, along with the creation of a variety of algorithms to measure various aspects of networks, makes network analysis a central tool for digital humanities.1 This post will provide an introduction to working with networks in R, using the example of the network of cities in the correspondence of Daniel van der Meulen in 1585.

There are a number of applications designed for network analysis and the creation of network graphs such as gephi and cytoscape. Though not specifically designed for it, R has developed into a powerful tool for network analysis. The strength of R in comparison to stand-alone network analysis software is three fold. In the first place, R enables reproducible research that is not possible with GUI applications. Secondly, the data analysis power of R provides robust tools for manipulating data to prepare it for network analysis. Finally, there is an ever growing range of packages designed to make R a complete network analysis tool. Significant network analysis packages for R include the statnet suite of packages and igraph. In addition, Thomas Lin Pedersen has recently released the tidygraph and ggraph packages that leverage the power of igraph in a manner consistent with the tidyverse workflow. R can also be used to make interactive network graphs with the htmlwidgets framework that translates R code to JavaScript.

This post begins with a short introduction to the basic vocabulary of network analysis, followed by a discussion of the process for getting data into the proper structure for network analysis. The network analysis packages have all implemented their own object classes. In this post, I will show how to create the specific object classes for the statnet suite of packages with the network package, as well as for igraph and tidygraph, which is based on the igraph implementation. Finally, I will turn to the creation of interactive graphs with the vizNetwork and networkD3 packages.

## Geocoding with R

### Using ggmap to geocode and map historical data

In the previous post I discussed some reasons to use R instead of Excel to analyze and visualize data and provided a brief introduction to the R programming language. That post used an example of letters sent to the sixteenth-century merchant Daniel van der Meulen in 1585. One aspect missing from the analysis was a geographical visualization of the data. This post will provide an introduction to geocoding and mapping location data using the ggmap package for R, which enables the creation of maps with ggplot. There are a number of websites that can help geocode location data and even create maps.1 You could also use a full-scale geographic information systems (GIS) application such as QGIS or ArcGIS. However, an active developer community has made it possible to complete a full range of geographic analysis from geocoding data to the creation of publication-ready maps with R.2 Geocoding and mapping data with R instead of a web or GIS application brings the general advantages of using a programming language in analyzing and visualizing data. With R, you can write the code once and use it over and over, while also providing a record of all your steps in the creation of a map.3

This post will merely scratch the surface of the mapping capabilities of R and will not enter into the domain of the more complex specific geographic packages available for R.4 Instead, it will build on the dplyr and ggplot skills discussed in my brief introduction to R. The example of geocoding and mapping with R will also provide another opportunity to show the advantages of coding. In particular, geocoding is a good example of how code can simplify the workflow for entering data. Instead of dealing with separate spreadsheets to store information about the letters and geographic information, coding makes it possible to create the geographic information directly from the letters data. The code to find the longitude and latitude of locations can be saved as a R script and rerun if new data is added to ensure that the information is always kept up to date.

## Excel vs R: A Brief Introduction to R

### With examples using dplyr and ggplot

Quantitative research often begins with the humble process of counting. Historical documents are never as plentiful as a historian would wish, but counting words, material objects, court cases, etc. can lead to a better understanding of the sources and the subject under study. When beginning the process of counting, the first instinct is to open a spreadsheet. The end result might be the production of tables and charts created in the very same spreadsheet document. In this post, I want to show why this spreadsheet-centric workflow is problematic and recommend the use of a programming language such as R as an alternative for both analyzing and visualizing data. There is no doubt that the learning curve for R is much steeper than producing one or two charts in a spreadsheet. However, there are real long-term advantages to learning a dedicated data analysis tool like R. Such advice to learn a programming language can seem both daunting and vague, especially if you do not really understand what it means to code. For this reason, after discussing why it is preferable to analyze data with R instead of a spreadsheet program, this post provides a brief introduction to R, as well as an example of analysis and visualization of historical data with R.1

## My Approach to Digital Humanities

Digital humanities holds the promise of increasing the means by which scholars are able to analyze and present data. Though some sentiments about the significance of digital humanities might be overblown, there is no doubt that the more ways we have to analyze sources the better. Learning a variety of the tools that make up the rather nebulous universe of digital humanities is like learning a new language. It opens up new possibilities that were previously closed or necessitated the expertise of others. This frames digital humanities as a collection of skills rather than a means to a predetermined end. I have adopted this perspective in learning about the possibilities opened by digital humanities and working on a digital humanities project. I am hardly the first person to take these steps, but I hope that by explaining my thought process I can set a basis for future posts on digital humanities.

If there has been one guiding force in my approach to digital humanities, it is to learn skills and tools in the process of production. In a way, this is simply the application of critical thinking to how I create my scholarly work. Instead of doing things in what seems the de facto manner, I have sought to question if there is either a more efficient way or a way in which I could gain or improve my competency. The goal of efficiency was particularly significant in thinking about how to organize my research and writing (DH 1.0), while learning new skills has been more important in the production of digital humanities projects (DH 2.0). This may not be the easiest way to complete a project in the short run, but by doing things the hard way, I am looking to open up new opportunities for future projects. With this theoretical approach in mind, let me now discuss a few concrete principles of my digital humanities practices concerning text, applications, and producing digital humanities projects.