README

Metadata Editor Import (r-factory)

R integration module of the Metadata Editor application. This module contains R scripts, Node-based R utility methods, and test cases. It passes data from NodeJS to R using PanApps' customized version of r-script. It provides data import and export features as well as data-analysis features such as resequencing variables and spreading metadata. The features are:

import dataset from file formats SPSS / STATA / CSV
export dataset to different file formats
destring string variables
resequence variables
spread metadata
export to dictionary
update variable status
calculate variable statistics

Prerequisites

Install Node and R if they are not already installed. On Windows, also set the corresponding environment variables.

Node 10.15.1
R version 3.3.3

Check that the R packages listed below are installed and that their versions match. If a package is missing, install it with the command install.packages("package_name").

R packages

jsonlite (version: 1.3)
haven (version: 1.1.0)
plyr (version: 1.8.4)
stringr (version: 1.2.0)
labelled (version: 1.0.0)
readr (version: 1.1.1)
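
As a quick way to apply the check above, the following Node sketch compares a map of installed R package versions against the versions listed in this README. The helper name checkPackages and the shape of the installed object are assumptions made for illustration; they are not part of r-factory. The installed versions themselves could be looked up in R with packageVersion("package_name").

```javascript
// Versions listed in this README.
const REQUIRED = {
  jsonlite: "1.3",
  haven: "1.1.0",
  plyr: "1.8.4",
  stringr: "1.2.0",
  labelled: "1.0.0",
  readr: "1.1.1",
};

// Hypothetical helper: reports packages that are missing or whose
// version differs from the listed one.
function checkPackages(installed, required = REQUIRED) {
  const problems = [];
  for (const [name, wanted] of Object.entries(required)) {
    const found = installed[name];
    if (found === undefined) {
      problems.push(`${name}: not installed (listed version ${wanted})`);
    } else if (found !== wanted) {
      problems.push(`${name}: found ${found}, listed version ${wanted}`);
    }
  }
  return problems;
}

// Example input: haven is missing and plyr is at a different version.
const problems = checkPackages({
  jsonlite: "1.3",
  plyr: "1.8.6",
  stringr: "1.2.0",
  labelled: "1.0.0",
  readr: "1.1.1",
});
problems.forEach((line) => console.log(line));
```

An empty result means every listed package was found at the listed version.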

Installation

Install the dependencies and devDependencies.

npm install

Build the application

npm run build

Test the application

npm run test

Publish the application to npm

npm publish --access public

Running the tests

Unit tests are written for each feature. Copy the input files to the test-data/input directory. The commands for running the unit tests are listed below.

Note: Start the editor before running the tests. The editor starts the OpenCPU API server, which the unit tests use.

`npm run test`

Unit test that checks the dataset import/export functionality. Keep only the dataset files to be tested in the test-data/input/datasets folder and remove any other files.

Known issue: some datasets (e.g. cs1_pupil.dta) may fail the unit tests because of labelled-integer validation when exporting to the STATA dataset format.

Flow of test execution:

import datasets from test-data/input/datasets directory
export the imported files to test-data/output/datasets
import the exported datasets
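
The three steps above amount to a round trip. A minimal sketch of that flow follows. The importFn and exportFn parameters are hypothetical stand-ins for the real import and export calls (which go through R), and the in-memory fakes exist only so the example runs without R or real dataset files. The sketch is synchronous for brevity.

```javascript
// Round-trip sketch: import -> export -> re-import, then compare.
// importFn and exportFn are hypothetical stand-ins, not r-factory functions.
function roundTrip(inputPath, outputPath, importFn, exportFn) {
  const original = importFn(inputPath);
  exportFn(original, outputPath);
  const reimported = importFn(outputPath);
  const names = (rows) => JSON.stringify(Object.keys(rows[0] || {}));
  return {
    sameVariables: names(original) === names(reimported),
    sameRowCount: original.length === reimported.length,
  };
}

// In-memory fakes keyed by file path (illustrative data only).
const files = new Map([
  ["test-data/input/datasets/sample.csv", [{ id: 1, age: 30 }, { id: 2, age: 41 }]],
]);
const fakeImport = (path) => files.get(path);
const fakeExport = (rows, path) => {
  files.set(path, rows.map((row) => Object.assign({}, row)));
};

const roundTripResult = roundTrip(
  "test-data/input/datasets/sample.csv",
  "test-data/output/datasets/sample.csv",
  fakeImport,
  fakeExport
);
console.log(roundTripResult);
```

Comparing variable names and row counts is only one possible check; a stricter comparison could also look at values and labels.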

`npm run test:resequence`

Unit test that imports datasets and resequences them. Place the dataset files to be tested in the test-data/input/datasets folder and run the command.

Flow of test execution:

import datasets from test-data/input/datasets directory
perform resequence and write the updated variable json file to the test-data/output/json directory

`npm run test:destring`

Unit test that checks the destring functionality on an imported file. Because the variables to be destringed must be named explicitly, the test is limited to one dataset, "ghs_2015_person_v1.1_20160608.dta". Keep this file in the input folder and remove the others while running the test.

Flow of test execution:

import datasets from test-data/input/datasets directory
destring the selected variables and write the updated csv file to the test-data/output/csv/ directory
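
To illustrate what destringing the selected variables means, here is a minimal sketch in Node. It is not the module's implementation, which runs in R. The function name destring, the row-object layout, and the rule of leaving a variable unchanged when any non-empty value is not numeric are all assumptions made for illustration.

```javascript
// Hypothetical sketch: convert the named string variables to numbers.
// A variable is converted only if every non-empty value is numeric;
// empty or missing values become null.
function destring(rows, variables) {
  const numeric = /^\s*-?(\d+\.?\d*|\.\d+)\s*$/;
  const isMissing = (value) => value === "" || value == null;
  const convertible = variables.filter((name) =>
    rows.every((row) => isMissing(row[name]) || numeric.test(String(row[name])))
  );
  return rows.map((row) => {
    const copy = Object.assign({}, row);
    for (const name of convertible) {
      copy[name] = isMissing(row[name]) ? null : Number(row[name]);
    }
    return copy;
  });
}

// Illustrative data: "age" is fully numeric, "region" is not.
const destringed = destring(
  [
    { hhid: "101", age: "34", region: "North" },
    { hhid: "102", age: "", region: "7" },
  ],
  ["age", "region"]
);
console.log(destringed);
```

In this example age is converted (with the empty value becoming null), region is left as text because "North" is not numeric, and hhid is untouched because it was not selected.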

`npm run test:dictionary`

`npm run test:dictionary:stata`

`npm run test:dictionary:spss`

Unit tests for export to dictionary format.

Flow of test execution:

import datasets from test-data/input/datasets directory
export dataset to the test-data/data-dictionary/ directory

`npm run test:validateKey`

Unit test that checks the unique key constraint for the given key variables of a dataset.

Steps:

Copy the data file to the test-data/input/datasets directory.
Set datasetname and keyVariables in the constructor method of dist/test/validation.unit.test.js.
Run the command.

Flow of test execution:

import dataset from test-data/input/datasets directory
validate the key variables
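
The validation step checks that the key variables together identify each row uniquely. A minimal sketch of such a check follows; findDuplicateKeys and the row-object layout are hypothetical and are not taken from validation.unit.test.js.

```javascript
// Hypothetical sketch: return the key combinations that occur more than once.
function findDuplicateKeys(rows, keyVariables) {
  const counts = new Map();
  for (const row of rows) {
    // JSON.stringify keeps ["1", "23"] distinct from ["12", "3"].
    const key = JSON.stringify(keyVariables.map((name) => row[name]));
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return Array.from(counts)
    .filter(([, count]) => count > 1)
    .map(([key, count]) => ({ values: JSON.parse(key), count }));
}

// Illustrative data: household 1, person 2 appears twice.
const people = [
  { hhid: 1, pid: 1, age: 40 },
  { hhid: 1, pid: 2, age: 12 },
  { hhid: 2, pid: 1, age: 55 },
  { hhid: 1, pid: 2, age: 13 },
];
const duplicates = findDuplicateKeys(people, ["hhid", "pid"]);
console.log(duplicates);
```

An empty result means the chosen key variables satisfy the unique key constraint. Here the combination hhid 1, pid 2 is reported with a count of 2.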

Contributors

Navin VI (navin.v.i@panapps.co)
Anoop Xaviour (anoopx@panapps.co)
Libin Thomas (libint@panapps.co)

License

MIT

@wbg-mde/r-factory
