@wbg-mde/r-factory

Metadata editor R integration module

Usage (no npm install needed)

<script type="module">
  import wbgMdeRFactory from 'https://cdn.skypack.dev/@wbg-mde/r-factory';
</script>

README

Metadata Editor Import(r-factory)

R integration module of the Metadata Editor application. It contains R scripts, Node-based R utility methods, and test cases. The module passes data from Node.js to R using PanApps' customized version of r-script. It provides data import and export features as well as data analysis features such as resequencing variables and spreading metadata. The features are:

  • import dataset from file formats SPSS / STATA / CSV
  • export dataset to different file formats
  • destring string variables
  • resequence variables
  • spread metadata
  • export to dictionary
  • update variable status
  • calculate variable statistics

Prerequisites

Install Node and R if they are not already installed. On Windows, set the corresponding environment variable.

  • Node 10.15.1
  • R version 3.3.3

Check that the required R packages are installed, and check their versions. Install any missing package with the command install.packages("package_name").

R packages
  • jsonlite (version: 1.3)
  • haven (version: 1.1.0)
  • plyr (version: 1.8.4)
  • stringr (version: 1.2.0)
  • labelled (version: 1.0.0)
  • readr (version: 1.1.1)
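
The version check can be scripted instead of done by hand. The following is a minimal sketch, not part of this module: buildVersionCheck is a hypothetical helper that only builds a line of R code from the package list above. The R functions it emits (requireNamespace, packageVersion, cat) are standard R. Whether the listed versions are exact requirements or minimums is not stated here, so the sketch only reports what is installed.

```javascript
// Package names and versions copied from the "R packages" list.
const requiredPackages = {
  jsonlite: "1.3",
  haven: "1.1.0",
  plyr: "1.8.4",
  stringr: "1.2.0",
  labelled: "1.0.0",
  readr: "1.1.1",
};

// Hypothetical helper: returns one line of R code that prints each package
// name followed by its installed version, or NA when it is not installed.
function buildVersionCheck(packages) {
  const names = Object.keys(packages)
    .map((name) => JSON.stringify(name)) // "jsonlite" is also a valid R string literal
    .join(", ");
  return (
    "for (p in c(" + names + ")) " +
    "cat(p, if (requireNamespace(p, quietly = TRUE)) " +
    'as.character(packageVersion(p)) else NA, "\\n")'
  );
}

// Print the R code so it can be pasted into an R session.
console.log(buildVersionCheck(requiredPackages));
```

Paste the printed line into an R session and compare the reported versions with the list above.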

Installation

Install the dependencies and devDependencies.

npm install

Build the application

npm run build

Test the application

npm run test

Publish the application to npm

npm publish --access public

Running the tests

Unit tests are written for each feature. Copy input files to the test-data/input directory. The commands for running the unit tests are listed below.

Note: Start the editor before running the tests. The editor starts the OpenCPU API server, which the unit tests use.

npm run test

Unit test for the dataset import/export functionality. Keep only the dataset files to be tested in the test-data/input/dataset folder and remove all other files.

Known issue: some datasets (e.g. cs1_pupil.dta) may fail the unit tests because of labelled-integer validation when exporting to the STATA dataset format.

Flow of test execution:

  • import datasets from test-data/input/datasets directory
  • export the imported files to test-data/output/datasets
  • import the exported datasets
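
The three steps above amount to a round trip. The sketch below shows the shape of that flow under stated assumptions: roundTrip, importDataset, exportDataset and sameVariables are hypothetical names, not this module's API, and the variable comparison after re-import is an added illustration rather than something the test is documented to do. In-memory fakes replace the R calls so the sketch runs without R or dataset files.

```javascript
// Hypothetical comparison: two datasets match when their variable lists match.
function sameVariables(a, b) {
  return JSON.stringify(a.variables) === JSON.stringify(b.variables);
}

// Runs import -> export -> re-import for each file, using injected
// (hypothetical) import and export functions.
function roundTrip(files, importDataset, exportDataset, outputDir) {
  return files.map((file) => {
    const imported = importDataset(file);                     // step 1: import
    const exportedPath = exportDataset(imported, outputDir);  // step 2: export
    const reimported = importDataset(exportedPath);           // step 3: import the export
    return { file, exportedPath, variablesMatch: sameVariables(imported, reimported) };
  });
}

// In-memory fakes standing in for the real, R-backed functions.
const store = {
  "test-data/input/datasets/a.dta": { name: "a.dta", variables: ["id", "age"] },
};
const fakeImport = (path) => store[path];
const fakeExport = (dataset, dir) => {
  const path = `${dir}/${dataset.name}`;
  store[path] = { ...dataset };
  return path;
};

const results = roundTrip(
  Object.keys(store),
  fakeImport,
  fakeExport,
  "test-data/output/datasets"
);
console.log(results);
```

Passing the import and export functions as arguments keeps the flow itself testable without the OpenCPU server; the real tests need the editor running, as noted above.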

npm run test:resequence

Unit test that imports datasets and resequences them. Drop the dataset files to be tested into the test-data/input/dataset folder and run the command.

Flow of test execution:

  • import datasets from test-data/input/datasets directory
  • perform resequence and write the updated variable JSON file to the test-data/output/json directory

npm run test:destring

Unit test for the destring functionality on an imported file. Because the variables to destring must be specified, the test is limited to one dataset, "ghs_2015_person_v1.1_20160608.dta". Keep this file in the input folder and remove the others while running the test.

Flow of test execution:

  • import datasets from test-data/input/datasets directory
  • perform destring on the selected variables and write the updated CSV file to the test_data/output/csv/ directory
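
To make the destring step concrete, here is a minimal sketch of the idea in plain JavaScript. It is an illustration only: the module performs destring in R, and the function name destring, the row format, and the rule "convert a variable only when every non-missing value is numeric" are assumptions made for this example.

```javascript
// A value counts as missing when it is null, undefined, or blank.
const isMissing = (value) => value == null || String(value).trim() === "";

// Converts the selected string variables to numbers. A variable is converted
// only when every non-missing value parses to a finite number with Number();
// otherwise it is left unchanged. Missing values become null.
function destring(rows, variables) {
  const convertible = variables.filter((name) =>
    rows.every((row) => isMissing(row[name]) || Number.isFinite(Number(row[name])))
  );
  return rows.map((row) => {
    const copy = { ...row }; // do not mutate the input rows
    for (const name of convertible) {
      copy[name] = isMissing(row[name]) ? null : Number(row[name]);
    }
    return copy;
  });
}

const people = [
  { id: "1", age: "34", name: "Ann" },
  { id: "2", age: "", name: "Bob" },
];
// age becomes 34 and null; name stays a string because "Ann" is not numeric.
console.log(destring(people, ["age", "name"]));
```

Only the variables passed in are considered, which mirrors why the unit test has to name the variables for one specific dataset.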

npm run test:dictionary

npm run test:dictionary:stata

npm run test:dictionary:spss

Unit tests for export to dictionary format.

Flow of test execution:

  • import datasets from test-data/input/datasets directory
  • export the dataset to the test_data/data-dictionary/ directory

npm run test:validateKey

Unit test that checks the unique key constraint for the given key variables of a dataset.

Steps:

  • Copy the data file to the test-data/input/datasets directory.
  • Set datasetname and keyVariables in the constructor method of dist/test/validation.unit.test.js.
  • Run the command.

Flow of test execution:

  • import dataset from test-data/input/datasets directory
  • validate the key variables
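
The unique key constraint itself is simple to state: no two rows may share the same combination of key variable values. A minimal sketch of that check follows. It assumes rows held as plain objects; findDuplicateKeys is a hypothetical name, and the module's own validation runs in R and may differ.

```javascript
// Returns the key value combinations that occur in more than one row.
// An empty result means the key variables uniquely identify every row.
function findDuplicateKeys(rows, keyVariables) {
  const counts = new Map();
  for (const row of rows) {
    // Serialize the key values so composite keys can be used as Map keys.
    const key = JSON.stringify(keyVariables.map((name) => row[name]));
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts]
    .filter(([, count]) => count > 1)
    .map(([key]) => JSON.parse(key));
}

const rows = [
  { household: 1, person: 1 },
  { household: 1, person: 2 },
  { household: 1, person: 1 },
];
console.log(findDuplicateKeys(rows, ["household", "person"])); // [ [ 1, 1 ] ]
console.log(findDuplicateKeys(rows.slice(0, 2), ["household", "person"])); // []
```

Returning the offending combinations, rather than only true or false, makes it easier to see which records break the constraint.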

Contributors

  • Navin VI (navin.v.i@panapps.co)
  • Anoop Xaviour (anoopx@panapps.co)
  • Libin Thomas (libint@panapps.co)

License

MIT