- Made a list of all data attributes that need to be extracted:
  - ID (data identifier)
  - Channel (data source)
  - Grimoire-creation-date (date of the GitHub comment or mail)
  - Context (subject of the mail, title of the GitHub issue, or PR title)
  - Body (body of the comment or mail)
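The five attributes above can be pictured as one flat record. The following is only a minimal sketch: the class name `EnrichedItem`, the snake_case field names, and the sample values are placeholders chosen for illustration, not names taken from the actual enrichers.

```python
from dataclasses import dataclass, asdict


@dataclass
class EnrichedItem:
    """One enriched record (illustrative field names, not from the source)."""
    id: str                      # data identifier
    channel: str                 # data source, e.g. "github" or "mbox"
    grimoire_creation_date: str  # date of the GitHub comment or mail
    context: str                 # mail subject, issue title, or PR title
    body: str                    # body of the comment or mail


# Placeholder values, only to show the shape of a record.
item = EnrichedItem(
    id="example-id",
    channel="github",
    grimoire_creation_date="2019-06-01T12:00:00+00:00",
    context="Example issue title",
    body="Example comment text",
)
record = asdict(item)  # plain dict, ready to be serialized as JSON
```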
- Made enrichers for GitHub and Mbox that enrich only the data attributes listed above.
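As a rough illustration of what such a reduced enricher does, the sketch below maps a raw item to the five attributes only. It is not the enricher code itself: the function names are hypothetical, and the raw item layout (a top-level `uuid` plus a `data` dict with `title`, `body`, `created_at` for GitHub and `Subject`, `Date`, `body.plain` for Mbox) is an assumption about how the raw items look.

```python
def enrich_github_issue(raw):
    """Reduce a raw GitHub issue item to the five attributes (assumed layout)."""
    data = raw["data"]
    return {
        "id": raw["uuid"],
        "channel": "github",
        "grimoire_creation_date": data["created_at"],
        "context": data["title"],
        "body": data.get("body") or "",  # the body may be missing or None
    }


def enrich_mbox_message(raw):
    """Reduce a raw Mbox message item to the five attributes (assumed layout)."""
    data = raw["data"]
    body = data.get("body") or {}
    return {
        "id": raw["uuid"],
        "channel": "mbox",
        "grimoire_creation_date": data["Date"],
        "context": data.get("Subject", ""),
        # Prefer the plain-text part and fall back to HTML if it is absent.
        "body": body.get("plain") or body.get("html") or "",
    }
```

Everything else in the raw item is dropped on purpose, so both sources end up with the same five keys.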
- Extracted the enriched data by enriching the raw indexes and running Elastic dump on them.
  - GitHub:
    - Grimoirelab-perceval
    - Grimoirelab-ELK
  - Mbox:
    - Grimoirelab mailing list
- Made an alias named "all_scms" that groups all of the SCMS enriched indexes.
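For reference, Elasticsearch aliases are created by posting a list of `add` actions to the `_aliases` endpoint. The sketch below only builds that request body; the helper name `build_alias_actions` and the index names `github_enriched` and `mbox_enriched` are hypothetical, since the real index names are not given here.

```python
import json


def build_alias_actions(indexes, alias="all_scms"):
    """Build the body for POST /_aliases that adds each index to the alias."""
    return {
        "actions": [
            {"add": {"index": name, "alias": alias}} for name in indexes
        ]
    }


# Hypothetical index names, used only for illustration.
payload = build_alias_actions(["github_enriched", "mbox_enriched"])
print(json.dumps(payload, indent=2))
```

Sending this body to the cluster makes a search against `all_scms` query every listed index at once.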
- Writing a script, ES2Excel, that converts the aliased enriched index to a CSV file, an XLS file, or an Airtable view.
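The CSV part of such a conversion could look roughly like the sketch below. It is not the ES2Excel script: it assumes the enriched items have already been fetched as dictionaries, the column names are illustrative, and the XLS and Airtable outputs are not covered.

```python
import csv
import io

# Assumed column names for the five extracted attributes.
FIELDS = ["id", "channel", "grimoire_creation_date", "context", "body"]


def write_csv(items, out):
    """Write enriched items to a file-like object, one row per item."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for item in items:
        # Keep only the known columns; missing values become empty strings.
        writer.writerow({field: item.get(field, "") for field in FIELDS})


# Placeholder item, written to an in-memory buffer instead of a file.
buffer = io.StringIO()
write_csv(
    [{"id": "1", "channel": "mbox", "grimoire_creation_date": "2019-06-01",
      "context": "Hello, world", "body": "Hi", "extra": "ignored"}],
    buffer,
)
csv_text = buffer.getvalue()
```

To write a real file, pass a handle opened with `open(path, "w", newline="")` in place of the buffer.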

