deirokay.profiler.profile
- deirokay.profiler.profile(df: DeirokayDataSource, document_name: str, save_to: Optional[str] = None) → Dict[str, Union[str, List[Dict[str, Union[str, List[Dict[str, Any]]]]]]]
Generate a validation document from a template DataFrame using the profiling methods of the builtin Deirokay statements. By default, statement objects are generated first for the template DataFrame as a whole (all columns taken together) and then for each of its columns individually. Use this function only to draft a validation document or to quickly launch a first version with minimal effort. Users are encouraged to correct and supplement the generated document so that it better meets their expectations.
- Parameters
df (DataFrame) – The DataFrame to use as template, ideally parsed with Deirokay data_reader.
document_name (str) – The validation document name.
save_to (Optional[str], optional) – Path (local or S3) where the validation document is saved. The file format is inferred from the file extension. If None, no document is saved. Defaults to None.
- Returns
The auto-generated validation document as a Python dict.
- Return type
dict
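Example
A minimal sketch of how the returned dict might be inspected before it is edited by hand. The call to profile uses only the signature documented above. The key names "name", "items", "scope", "statements" and "type", the statement types in the sample, and the helpers count_statements and draft_and_count are assumptions made for illustration: they are inferred from the nested shape of the return type and are not confirmed by this page. The sample document is hand-written, not real profiler output.

```python
from typing import Any, Dict


def count_statements(document: Dict[str, Any]) -> Dict[str, int]:
    """Count the generated statements per scope in a validation document.

    Assumes (hypothetically) that the document holds a list under "items",
    and that each item has a "scope" and a list under "statements".
    """
    counts: Dict[str, int] = {}
    for item in document.get("items", []):
        scope = item.get("scope", "")
        # A scope may name one column or several; normalise to a string key.
        key = scope if isinstance(scope, str) else ", ".join(scope)
        counts[key] = counts.get(key, 0) + len(item.get("statements", []))
    return counts


def draft_and_count(df: Any, document_name: str) -> Dict[str, int]:
    """Profile a DataFrame and summarise the draft (requires Deirokay)."""
    # Imported lazily so the rest of this sketch runs without Deirokay.
    from deirokay.profiler import profile

    document = profile(df, document_name=document_name, save_to=None)
    return count_statements(document)


# Hand-written stand-in with the assumed layout, for illustration only.
sample = {
    "name": "my_document",
    "items": [
        {"scope": "id", "statements": [{"type": "unique"}, {"type": "not_null"}]},
        {"scope": "name", "statements": [{"type": "not_null"}]},
    ],
}

print(count_statements(sample))
```

A summary like this can help spot columns for which the draft contains few or no statements, which are natural candidates for manual additions.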