niklib.data package
Submodules
niklib.data.constant module
- niklib.data.constant.EXAMPLE_FINANCIAL_RATIOS = {'deposit2rent': 0.03, 'deposit2worth': 5.0, 'income2tax': 0.15, 'income2worth': 15.0, 'rent2deposit': 33.333333333333336, 'tax2income': 6.666666666666667, 'worth2deposit': 0.2, 'worth2income': 0.06666666666666667}
Ratios used to convert rent, deposit, and total worth to each other
Note
This is one of several dictionaries of factors used in heuristic calculations based on domain knowledge.
- Info:
Although this was created as a code example, the values chosen here come from basic rules of thumb and can actually be used if no other reliable information is available.
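A minimal sketch of how these factors might be applied. It assumes that a key of the form a2b is the factor that multiplies an amount of a to give an amount of b; the convert helper is hypothetical and not part of niklib. The dictionary literal is copied from the documented value above.

```python
import math

# Copied from the documented value of EXAMPLE_FINANCIAL_RATIOS.
EXAMPLE_FINANCIAL_RATIOS = {
    'deposit2rent': 0.03,
    'deposit2worth': 5.0,
    'income2tax': 0.15,
    'income2worth': 15.0,
    'rent2deposit': 33.333333333333336,
    'tax2income': 6.666666666666667,
    'worth2deposit': 0.2,
    'worth2income': 0.06666666666666667,
}


def convert(amount: float, source: str, target: str) -> float:
    """Hypothetical helper: scale `amount` by the '<source>2<target>' factor."""
    return amount * EXAMPLE_FINANCIAL_RATIOS[f"{source}2{target}"]


# Every 'a2b' factor is the reciprocal of its 'b2a' counterpart.
for key, factor in EXAMPLE_FINANCIAL_RATIOS.items():
    source, target = key.split("2")
    inverse = EXAMPLE_FINANCIAL_RATIOS[f"{target}2{source}"]
    assert math.isclose(factor * inverse, 1.0)

# Estimate rent and total worth from a known deposit (assumed reading of the keys).
rent = convert(1000.0, "deposit", "rent")
worth = convert(1000.0, "deposit", "worth")
```

Under that reading, only one quantity needs to be known to estimate the others, which is what makes the dictionary useful as a fallback.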
- class niklib.data.constant.ExampleFillna(value)[source]
Bases: CustomNamingEnum
Values used to fill Nones depending on the form structure
Members follow the <field_name>_<form_name> naming convention. The value has been extracted by manually inspecting the documents. Hence, for each form, the user must find and set this value manually.
Note
We do not use any heuristics here; we just follow what the form uses and only add one more option, which should be used as the None state; i.e. None as a separate feature in categorical mode.
- CHD_M_STATUS_5645E = 9
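As an illustration of the idea, the sketch below replaces missing entries with the documented fill value so that None becomes its own category. The enum is a standard-library stand-in for the niklib class (which derives from CustomNamingEnum), and fill_missing and the sample data are hypothetical.

```python
from enum import Enum


class ExampleFillna(Enum):
    # Stand-in for niklib.data.constant.ExampleFillna; value taken from the docs.
    CHD_M_STATUS_5645E = 9


def fill_missing(values, fill: ExampleFillna):
    """Hypothetical helper: replace None entries with the enum member's value."""
    return [fill.value if v is None else v for v in values]


# Made-up marital status codes for a child field of form 5645E.
child_marital_status = [3, None, 7, None]
filled = fill_missing(child_marital_status, ExampleFillna.CHD_M_STATUS_5645E)
```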
- class niklib.data.constant.ExampleDocTypes(value)[source]
Bases: CustomNamingEnum
Contains all document types which can be used to customize ETL steps for each document type
Members follow the <country_name>_<document_type> naming convention. The value and its order are meaningless.
- CANADA = 1
- CANADA_5257E = 2
- CANADA_5645E = 3
- CANADA_LABEL = 4
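One way such an enum could drive per-document ETL customization is a dispatch table keyed by member. This is only a sketch: the enum is a standard-library stand-in, and the process_* functions, ETL_STEPS, and run_etl are hypothetical names that do not come from niklib.

```python
from enum import Enum
from typing import Callable, Dict


class ExampleDocTypes(Enum):
    # Stand-in for niklib.data.constant.ExampleDocTypes; values taken from the docs.
    CANADA = 1
    CANADA_5257E = 2
    CANADA_5645E = 3
    CANADA_LABEL = 4


def process_5257e(raw: dict) -> dict:
    """Hypothetical ETL step for form 5257E."""
    return {"form": "5257E", **raw}


def process_5645e(raw: dict) -> dict:
    """Hypothetical ETL step for form 5645E."""
    return {"form": "5645E", **raw}


# Keyed by member, so the numeric values and their order never matter.
ETL_STEPS: Dict[ExampleDocTypes, Callable[[dict], dict]] = {
    ExampleDocTypes.CANADA_5257E: process_5257e,
    ExampleDocTypes.CANADA_5645E: process_5645e,
}


def run_etl(doc_type: ExampleDocTypes, raw: dict) -> dict:
    """Look up and run the step registered for `doc_type`."""
    try:
        step = ETL_STEPS[doc_type]
    except KeyError:
        raise ValueError(f"no ETL step registered for {doc_type.name}") from None
    return step(raw)


result = run_etl(ExampleDocTypes.CANADA_5645E, {"applicant": "A"})
```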
- class niklib.data.constant.ExampleMarriageStatus(value)[source]
Bases: CustomNamingEnum
States of marriage in (some specific) form
Note
Values for the members are the values used in the original forms. Hence, they should not be modified by any means, as they are tied to the dataset, transformations, and other domain-specific values.
- Info:
These values have been chosen for demonstration purposes in this class and do not carry any meaning or information (El No Sabe). In real-world use, you must choose meaningful ones.
- COMMON_LAW = 69
- DIVORCED = 3
- SEPARATED = 4
- MARRIED = 0
- SINGLE = 7
- WIDOWED = 85
- UNKNOWN = 9
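Because the member values mirror the codes found in the form, a raw code can be turned into a member by value lookup. The sketch below uses a standard-library stand-in for the niklib class; parse_status is hypothetical, and falling back to UNKNOWN for unrecognized codes is an assumption made for illustration, not documented behavior.

```python
from enum import Enum
from typing import Optional


class ExampleMarriageStatus(Enum):
    # Stand-in for niklib.data.constant.ExampleMarriageStatus; values from the docs.
    COMMON_LAW = 69
    DIVORCED = 3
    SEPARATED = 4
    MARRIED = 0
    SINGLE = 7
    WIDOWED = 85
    UNKNOWN = 9


def parse_status(raw: Optional[int]) -> ExampleMarriageStatus:
    """Hypothetical helper: map a raw form code to a member by value."""
    try:
        return ExampleMarriageStatus(raw)
    except ValueError:
        # Assumed fallback for missing or unrecognized codes.
        return ExampleMarriageStatus.UNKNOWN


divorced = parse_status(3)
missing = parse_status(None)
```

Changing any member value would silently break this lookup against existing data, which is why the values must stay as they appear in the form.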
- class niklib.data.constant.ExampleSex(value)[source]
Bases: CustomNamingEnum
Sex types in general
Note
The values of the enum members are not important, hence no explicit values are assigned.
- Info:
The names of the members had to be customized because of bad preprocessing (or in some cases, domain-specific knowledge); hence, name has been overridden.
- FEMALE = 1
- MALE = 2
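The documented values 1 and 2 are consistent with automatic value assignment. The sketch below shows this with enum.auto() from the standard library; whether niklib uses auto() is an assumption, and the overridden name behavior of CustomNamingEnum is not reproduced here.

```python
from enum import Enum, auto


class ExampleSex(Enum):
    # Stand-in for niklib.data.constant.ExampleSex, assuming auto() is used.
    FEMALE = auto()
    MALE = auto()


# auto() numbers members from 1 in definition order.
values = [member.value for member in ExampleSex]
```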