Coming at SEND from a preclinical background has been a strange experience. One of the most unusual things about it, from my perspective, is the use of SAS transport files to convey data. Why wrap deliveries up in an arcane binary format where you have to jump through hoops to consume the data? I guess the idea is that most people in safety assessment can deal with this file format because of the software they must have to meet regulatory requirements. The same can’t generally be said of the groups earlier in the drug discovery process, where Excel remains the tool of choice for dealing with data. So to understand the example data set provided by CDISC, I needed to find a way of opening and investigating the data files. The fastest way for me to do this on my Windows machine turned out to be:
- Grab a Python environment
- Grab the Python xport extension someone has very kindly developed
- Write some very simple Python to extract all the .xpt files to .csv files
This was a great leg up to understanding the content of the files, how the domains are organised and so on. However, I also needed a way of writing the files, and this library only converts one way, so I had to look elsewhere for an answer.
In the end I turned to the statistical package R. The R package SASxport provides converters for xpt files which are pretty easy to use. Here is a typical script, which takes a CSV input and creates an xpt output:
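# Convert a CSV file to a SAS transport (xpt) file
library(SASxport)
# Command line arguments: input CSV path, then output xpt path
args <- commandArgs(TRUE)
input <- args[1]
output <- args[2]
# Read the CSV, then write the data frame out as an xpt file
dat <- read.csv(input, header=TRUE)
write.xport(dat, file=output)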
I then wrapped this in a Windows batch file:
set R_Script=<PATH to RScript.exe>
%R_Script% <PATH to actual script> <PATH to csv input> <PATH to xpt output> > output.log 2>&1
This runs the R script, passing in the two parameters and redirecting both standard output and any errors to a log file. This log file proved to be very handy in tracking down errors.
I was then able to spawn this batch file from within Node.js:
var ls = require('child_process').spawn('cmd', ['/s', '/c', '<PATH to batch file>'], { windowsVerbatimArguments: true });
The extra step of spawning cmd and passing the batch file in as a parameter is needed because Node’s spawn command expects an executable, not a batch script. I guess I could bypass the batch file step and call RScript.exe directly. I’ll try that now.
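For what it’s worth, here is a minimal sketch of what calling RScript.exe directly might look like. It is untested against a real R install, and runRscript and all of its parameters are hypothetical names of my own, not anything from the setup above. It assumes the R script takes the input and output paths as its two arguments, and it sends stdout and stderr to a single log file, much as the batch file’s redirect did.

```javascript
const { spawn } = require('child_process');
const fs = require('fs');

// Hypothetical helper: run an R script via the Rscript executable directly,
// with no batch file in between. All paths are supplied by the caller.
function runRscript(rscriptExe, scriptPath, inputCsv, outputXpt, logPath) {
  return new Promise((resolve, reject) => {
    // Send stdout and stderr to one log file, like "> output.log 2>&1".
    const logFd = fs.openSync(logPath, 'w');
    let settled = false;
    const finish = (err, code) => {
      if (settled) return; // 'error' and 'close' can both fire; act only once
      settled = true;
      fs.closeSync(logFd);
      if (err) reject(err);
      else resolve(code);
    };

    const child = spawn(rscriptExe, [scriptPath, inputCsv, outputXpt], {
      stdio: ['ignore', logFd, logFd],
    });
    child.on('error', (err) => finish(err)); // e.g. executable not found
    child.on('close', (code) => finish(null, code)); // resolves with the exit code
  });
}
```

Calling it would look something like runRscript('<PATH to RScript.exe>', '<PATH to actual script>', '<PATH to csv input>', '<PATH to xpt output>', 'output.log'), then checking that the exit code it resolves with is 0 and reading the log file if it is not.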
Anyway, the result is that I can now create xpt files from within Node, so I can trigger the creation of SEND files from my server without using any complicated libraries or SAS components.