@tricoteuses/assemblee

Retrieve, clean up & handle French Assemblée nationale's open data

Usage: no npm install needed!

<script type="module">
  import tricoteusesAssemblee from 'https://cdn.skypack.dev/@tricoteuses/assemblee';
</script>

README

Tricoteuses-Assemblee

Retrieval of open data (in JSON format) from Assemblée nationale's website

mkdir ../assemblee-data/
npx babel-node --extensions ".ts" -- src/scripts/retrieve_open_data.ts --fetch ../assemblee-data/

Reorganizing open data files and directories into cleaner (and split) directories

npx babel-node --extensions ".ts" -- src/scripts/reorganize_data.ts --no-validate-raw ../assemblee-data/

Note: These reorganized files are also available in Tricoteuses / Data / Données brutes de l'Assemblée. They are updated on a regular basis.

Validation & cleaning of JSON data

npx babel-node --extensions ".ts" -- src/scripts/clean_reorganized_data.ts ../assemblee-data/

Note: These split & cleaned files are also available in Tricoteuses / Data / Données nettoyées de l'Assemblée with the _nettoye suffix. They are updated on a regular basis.

Retrieval of députés' pictures from Assemblée nationale's website

npx babel-node --extensions ".ts" -- src/scripts/retrieve_deputes_photos.ts --fetch ../assemblee-data/

Retrieval of sénateurs' pictures from Sénat's website

npx babel-node --extensions ".ts" -- src/scripts/retrieve_senateurs_photos.ts --fetch ../assemblee-data/

Retrieval of pending amendments from Assemblée nationale's website

(Pending amendments are amendments waiting to be processed by Assemblée services.)

npx babel-node --extensions ".ts" -- src/scripts/retrieve_pending_amendments.ts --incremental ../assemblee-data/

Retrieval of documents from Assemblée nationale's website

npx babel-node --extensions ".ts" -- src/scripts/retrieve_documents.ts --textes ../data/assemblee-textes ../data/assemblee-nettoye/Dossiers_Legislatifs_XV_nettoye/documents/**/*.json

Test loading everything in memory

Test loading small split files

npx babel-node --extensions ".ts" --max-old-space-size=2048 -- src/scripts/test_load.ts ../assemblee-data/

Test loading big non-split files

npx babel-node --extensions ".ts" --max-old-space-size=2048 -- src/scripts/test_load_big_files.ts ../assemblee-data/

Note: The big non-split open data files should not be used. Use small split files instead.
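
As an illustration of why small split files are easier to handle, the sketch below parses them one at a time, so memory use is bounded by the largest single file rather than by the whole dataset. The helper names `listJsonFiles` and `countValidJsonFiles` are hypothetical, and the layout (plain `.json` files nested in subdirectories) is an assumption, not a description of what `test_load.ts` does.

```typescript
import * as fs from "fs";
import * as path from "path";

// Hypothetical helper: collect the paths of all .json files under `dir`,
// descending into subdirectories.
function listJsonFiles(dir: string): string[] {
  const found: string[] = [];
  // Sort the names so that the traversal order is deterministic.
  const names = fs.readdirSync(dir).sort();
  for (const name of names) {
    const entryPath = path.join(dir, name);
    if (fs.statSync(entryPath).isDirectory()) {
      found.push(...listJsonFiles(entryPath));
    } else if (path.extname(name) === ".json") {
      found.push(entryPath);
    }
  }
  return found;
}

// Parse each file in turn; only one parsed document is referenced at a time.
function countValidJsonFiles(dir: string): number {
  let count = 0;
  for (const filePath of listJsonFiles(dir)) {
    JSON.parse(fs.readFileSync(filePath, "utf-8")); // throws on malformed JSON
    count += 1;
  }
  return count;
}
```

Calling `countValidJsonFiles("../assemblee-data/")` would then walk the data directory used in the commands above.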

Initial generation of TypeScript & JSON schema files from JSON data.

npx quicktype --acronym-style=camel -o src/raw_types/acteurs_et_organes.ts ../assemblee-data/AMO{10,20,30,40,50}_*.json

Edit src/raw_types/acteurs_et_organes.ts to:

  • Replace r("Secretaire02") with "".
  • Remove 2 definitions of Secretaire02 and replace it with string elsewhere.

npx quicktype --acronym-style=camel -o src/raw_types/agendas.ts ../assemblee-data/Agenda_{XIV,XV}.json

npx babel-node --extensions ".ts" --max-old-space-size=8192 -- src/scripts/raw_types_from_amendements.ts ../assemblee-data/

Edit src/raw_types/amendements.ts to:

  • Replace r("ActeurRefElement") with "".

  • Remove 2 definitions of ActeurRefElement and replace it with string elsewhere.

  • Replace r("AuteurRapporteurOrganeRefEnum") with "".

  • Remove 2 definitions of AuteurRapporteurOrganeRefEnum and replace it with string elsewhere.

  • Replace r("Code") with "".

  • Remove 2 definitions of Code and replace it with string elsewhere.

  • Replace r("CodeMissionMinefi") with "".

  • Remove 2 definitions of CodeMissionMinefi and replace it with string elsewhere.

  • Replace r("DivisionRattacheeEnum") with "".

  • Remove 2 definitions of DivisionRattacheeEnum and replace it with string elsewhere.

  • Replace r("GouvernementRefEnum") with "".

  • Remove 2 definitions of GouvernementRefEnum and replace it with string elsewhere.

  • Replace r("GroupePolitiqueRefEnum") with "".

  • Remove 2 definitions of GroupePolitiqueRefEnum and replace it with string elsewhere.

  • Replace r("LigneCreditLibelle") with "".

  • Remove 2 definitions of LigneCreditLibelle and replace it with string elsewhere.

  • Replace r("PrefixeOrganeExamen") with "".

  • Remove 2 definitions of PrefixeOrganeExamen and replace it with string elsewhere.

  • Add:

    export interface Amendements {
      textesLegislatifs: TexteLegislatif[]
    }
    
  • Add:

    export interface AmendementWrapper {
      amendement: Amendement
    }
    
  • Add:

      "Amendements": o([
          { json: "textesLegislatifs", js: "textesLegislatifs", typ: a(r("TexteLegislatif")) },
      ], false),
    
  • Add:

      "AmendementWrapper": o([
          { json: "amendement", js: "amendement", typ: r("Amendement") },
      ], false),
    
  • Add the following static methods to class Convert:

      public static toAmendements(json: string): Amendements {
          return cast(JSON.parse(json), r("Amendements"));
      }
    
      public static amendementsToJson(value: Amendements): string {
          return JSON.stringify(uncast(value, r("Amendements")), null, 2);
      }
    
      public static toAmendementWrapper(json: string): AmendementWrapper {
          return cast(JSON.parse(json), r("AmendementWrapper"));
      }
    
      public static amendementWrapperToJson(value: AmendementWrapper): string {
          return JSON.stringify(uncast(value, r("AmendementWrapper")), null, 2);
      }
    
npx quicktype --acronym-style=camel -o src/raw_types/dossiers_legislatifs.ts ../assemblee-data/Dossiers_Legislatifs_{XIV,XV}.json

Edit src/raw_types/dossiers_legislatifs.ts to:

  • Replace regular expression r\(".+CodeActe"\) with r("CodeActe").
  • Remove definitions of regular expression [^ ]+CodeActe; and replace it with CodeActe;.
  • Add import { CodeActe } from "../shared_types/codes_actes" on top of file.
  • Remove occurrences of "[^"]*CodeActe": and replace them with one "CodeActe": Object.values(CodeActe),.
  • Remove occurrences of CodeActe \{.
  • Replace regular expression r\(".*OrganeRef"\) with "".
  • Remove definitions of regular expression [^ ]*OrganeRef and replace it with string.
  • Replace regular expression r\(".*DossierRef"\) with "".
  • Remove definitions of regular expression [^ ]*DossierRef and replace it with string.
  • Replace regular expression r\(".*AuteurMotion"\) with "".
  • Remove definitions of regular expression [^ ]*AuteurMotion and replace it with string.
  • Replace regular expression r\(".*DenominationStructurelle"\) except for DocumentDenominationStructurelle with "".
  • Remove 2 definitions of regular expression [^ ]*DenominationStructurelle except for DocumentDenominationStructurelle and replace it with string.
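
The search-and-replace steps above can also be scripted. The sketch below applies two of them (the CodeActe and OrganeRef references) to the text of a generated file. The function name `applyRefReplacements` and the type names in the comments are hypothetical, and the patterns are tightened from `.+` / `.*` to `[^"]+` / `[^"]*` so that a match cannot run across several quoted strings on the same line.

```typescript
// Hypothetical helper applying two of the manual edits listed above to the
// source text of a quicktype-generated file.
function applyRefReplacements(source: string): string {
  return source
    // e.g. r("ActeLegislatifCodeActe") becomes r("CodeActe");
    // an existing r("CodeActe") is left untouched.
    .replace(/r\("[^"]+CodeActe"\)/g, 'r("CodeActe")')
    // e.g. r("PurpleOrganeRef") becomes "".
    .replace(/r\("[^"]*OrganeRef"\)/g, '""');
}
```

The remaining edits (removing the now-unused definitions, adding the import) still have to be done by hand or with further patterns.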

npx babel-node --extensions ".ts" -- src/scripts/merge_scrutins.ts -v ../assemblee-data/

npx quicktype --acronym-style=camel -o src/raw_types/scrutins.ts ../assemblee-data/Scrutins_{XIV,XV_fusionne}.json

Edit src/raw_types/scrutins.ts to:

  • Replace r("ActeurRef") with "".

  • Remove 2 definitions of ActeurRef and replace it with string elsewhere.

  • Replace r("GroupeOrganeRef") with "".

  • Remove 2 definitions of GroupeOrganeRef and replace it with string elsewhere.

  • Replace r("MandatRef") with "".

  • Remove 2 definitions of MandatRef and replace it with string elsewhere.

  • Replace r("ScrutinOrganeRef") with "".

  • Remove 2 definitions of ScrutinOrganeRef and replace it with string elsewhere.

  • Replace r("SessionRef") with "".

  • Remove 2 definitions of SessionRef and replace it with string elsewhere.

  • Add:

    export interface ScrutinWrapper {
      scrutin: Scrutin
    }
    
  • Add:

      "ScrutinWrapper": o([
          { json: "scrutin", js: "scrutin", typ: r("Scrutin") },
      ], false),
    
  • Add the following static methods to class Convert:

      public static toScrutinWrapper(json: string): ScrutinWrapper {
          return cast(JSON.parse(json), r("ScrutinWrapper"));
      }
    
      public static scrutinWrapperToJson(value: ScrutinWrapper): string {
          return JSON.stringify(uncast(value, r("ScrutinWrapper")), null, 2);
      }
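
To illustrate what the wrapper type is for, here is a simplified, self-contained sketch. It presumes that a wrapped file holds a single object under a `scrutin` key, as the ScrutinWrapper interface suggests. The one-field `Scrutin` type and the `parseScrutinWrapper` function are stand-ins written for this example, not the quicktype-generated `Convert` methods.

```typescript
// Simplified stand-in for the generated Scrutin type (the field is illustrative).
interface Scrutin {
  uid: string;
}

interface ScrutinWrapper {
  scrutin: Scrutin;
}

// Hypothetical hand-written checked parse: reject any JSON whose top level
// is not an object holding a `scrutin` object with a string `uid`.
function parseScrutinWrapper(json: string): ScrutinWrapper {
  const value: unknown = JSON.parse(json);
  if (typeof value !== "object" || value === null) {
    throw new Error("Expected a JSON object");
  }
  const scrutin = (value as { scrutin?: unknown }).scrutin;
  if (typeof scrutin !== "object" || scrutin === null) {
    throw new Error('Missing "scrutin" object');
  }
  const uid = (scrutin as { uid?: unknown }).uid;
  if (typeof uid !== "string") {
    throw new Error('Missing "scrutin.uid" string');
  }
  return { scrutin: { uid } };
}
```

The generated `Convert.toScrutinWrapper` plays the same role, with the full Scrutin type map doing the checking.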
    

Updating JSON schema files and validating JSON files

  • Convert src/types/*.ts into JSON schemas for comparison purposes:

    for f in src/types/*.ts ; do b=$(basename "$f" .ts) ; npx typescript-json-schema "src/types/$b.ts" '*' > "src/schemas/converted_from_type/$b.json" ; done

  • Manually update src/schemas//.json to account for these differences.

  • Verify that the JSON files validate against the updated schemas:

    npx babel-node --extensions .ts -- src/scripts/validate_json.ts --repository=$(git rev-parse --show-toplevel) --dataset ../data/assemblee-nettoye/AMO*nettoye
    npx babel-node --extensions .ts -- src/scripts/validate_json.ts --repository=$(git rev-parse --show-toplevel) --dataset ../data/assemblee-nettoye/Dossiers_Legislatifs_XV_nettoye
    etc.

If an error occurs and the schema must be fixed:

  • Verify the schema works by using --dev, which takes the schemas from the current working directory instead of fetching them from the tag matching the version mentioned in the JSON file. For instance, if the file acteurs/PA766283.json has schemaVersion = "acteur-1.0", validation uses the schema found at the tag schema-acteur-1.0 and not the one in the current working directory, unless --dev is used.
  • Once the schema is verified to work, add a tag matching the directory of the schema. For instance for amendement/Amendement.json or any of its references (i.e. amendement/*.json), set the tag schema-amendement-X.Y.
    • If the schema change is backward compatible (i.e. software using the corresponding JSON won't break), increment Y (X.1, X.2, ...)
    • If the schema change is not backward compatible, increment X and set Y to zero (1.0, 2.0, ...)
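
The tagging rule above can be written down as a small function. This is only a sketch: `nextSchemaTag` and its parameters are hypothetical names, and the `schema-<name>-X.Y` tag format is inferred from the examples given (schema-acteur-1.0, schema-amendement-X.Y).

```typescript
// Hypothetical helper: compute the next tag for a schema directory.
function nextSchemaTag(currentTag: string, backwardCompatible: boolean): string {
  // Assumed format: schema-<name>-<X>.<Y>
  const match = /^schema-(.+)-(\d+)\.(\d+)$/.exec(currentTag);
  if (match === null) {
    throw new Error(`Unexpected tag format: ${currentTag}`);
  }
  const name = match[1];
  const major = parseInt(match[2], 10);
  const minor = parseInt(match[3], 10);
  return backwardCompatible
    ? `schema-${name}-${major}.${minor + 1}` // compatible change: increment Y
    : `schema-${name}-${major + 1}.0`; // breaking change: increment X, reset Y
}
```

For example, a backward-compatible change to a schema tagged schema-acteur-1.0 gives schema-acteur-1.1, while a breaking change gives schema-acteur-2.0.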

The tag with the highest version is used by src/scripts/clean_reorganized_data.ts to add a schemaVersion field to all JSON files created in a *_nettoye repository from that point on. The goal is for each JSON file to validate against an immutable schema identified by a version tag, and to allow each JSON file to have a different version of the schema.
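
Picking the tag with the highest version calls for a numeric comparison, since a plain string sort would place 1.10 before 1.9. A minimal sketch, assuming the tag names are already available as a list of strings (how the script actually obtains and compares them is not shown here) and using the hypothetical name `latestSchemaVersion`:

```typescript
// Hypothetical helper: among tags such as "schema-acteur-1.2", return the
// schemaVersion value (e.g. "acteur-1.2") with the highest X.Y for `name`,
// or undefined when no tag matches.
function latestSchemaVersion(tags: string[], name: string): string | undefined {
  const prefix = `schema-${name}-`;
  let best: [number, number] | undefined = undefined;
  for (const tag of tags) {
    if (tag.indexOf(prefix) !== 0) continue;
    const match = /^(\d+)\.(\d+)$/.exec(tag.slice(prefix.length));
    if (match === null) continue;
    const version: [number, number] = [parseInt(match[1], 10), parseInt(match[2], 10)];
    // Compare X first, then Y, as numbers.
    if (
      best === undefined ||
      version[0] > best[0] ||
      (version[0] === best[0] && version[1] > best[1])
    ) {
      best = version;
    }
  }
  return best === undefined ? undefined : `${name}-${best[0]}.${best[1]}`;
}
```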

See the discussion in the forum for more information.

Helpers to create documentation

npx babel-node --extensions .ts -- src/scripts/document_dossiers_legislatifs.ts --data ../data/assemblee-nettoye/Dossiers_Legislatifs_{XIV,XV}_nettoye/dossiers/**/*.json

See the data-site README for more information about how it is used.

Obsolete or Now Useless Scripts

Validation & cleaning of big non-split files

npx babel-node --extensions ".ts" --max-old-space-size=8192 -- src/scripts/clean_data.ts ../assemblee-data/

Note: The big non-split open data files should not be used. Use small split files instead.

Retrieval of députés' non-open-data information from Assemblée nationale's website

npx babel-node --extensions ".ts" -- src/scripts/retrieve_deputes_infos.ts --fetch --parse ../assemblee-data/