Add date/time data type assessment to data profile

Team - Fantastic work. One feature that (I think) would help sweetviz take off as a baseline EDA capability is assessing/profiling date/time data types differently than object data types. One major function of EDA is understanding the span of time a data set covers (potentially to identify gaps in data). If you treat the date/time values as numbers (e.g., Unix epoch timestamps in UTC or MS Excel serial numbers), you can calculate the same insights for date/time data as you do for numerical data (i.e., convert to a numeric value for the calculations, then convert back to date/time values where it makes sense). Plotting the distribution of these data on a timeline would be helpful. More advanced plotting could include distributions of data aggregated by time filters (e.g., day of week, month of year, hour of day, etc.).
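
To make the idea concrete, here is a minimal sketch using only the Python standard library. It is not sweetviz code: the function name `profile_datetimes` and the sample dates are hypothetical, and it assumes timezone-aware values. It converts each value to epoch seconds, computes the usual numeric statistics, converts the results back to date/time values, and counts values per day of week.

```python
from collections import Counter
from datetime import datetime, timezone
from statistics import mean, median

SECONDS_PER_DAY = 86400
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def profile_datetimes(values):
    """Profile a non-empty list of timezone-aware datetimes (hypothetical helper)."""
    # Convert to numbers (seconds since the Unix epoch) for the calculations.
    seconds = sorted(v.timestamp() for v in values)

    def to_dt(s):
        # Convert a numeric result back to a date/time value.
        return datetime.fromtimestamp(s, tz=timezone.utc)

    # Differences between consecutive values reveal gaps in coverage.
    gaps = [b - a for a, b in zip(seconds, seconds[1:])]
    return {
        "min": to_dt(seconds[0]),
        "max": to_dt(seconds[-1]),
        "mean": to_dt(mean(seconds)),
        "median": to_dt(median(seconds)),
        "span_days": (seconds[-1] - seconds[0]) / SECONDS_PER_DAY,
        "largest_gap_days": max(gaps) / SECONDS_PER_DAY if gaps else 0.0,
        # Aggregate by a time filter, here day of week.
        "by_weekday": Counter(WEEKDAYS[v.weekday()] for v in values),
    }


# Made-up sample data, for illustration only.
sample = [
    datetime(2021, 1, 1, tzinfo=timezone.utc),
    datetime(2021, 1, 2, tzinfo=timezone.utc),
    datetime(2021, 1, 3, tzinfo=timezone.utc),
    datetime(2021, 1, 10, tzinfo=timezone.utc),
]
profile = profile_datetimes(sample)
```

For this sample, the span is 9 days and the largest gap is the 7 days between January 3 and January 10, which is the kind of hole in coverage I would want the report to surface.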

I’m happy to bounce ideas around with you.

-Ryan

Hello Ryan,
Thank you for the good words and detailed feedback. Apologies for the long delay in answering: I missed your initial post for some reason, and then I just haven’t been able to put time towards the project in the last couple of weeks.

Date/time has definitely been high on the list of things to process better (alongside text data). However, adding a whole new data type adds considerable complexity, since every type must be plotted against every other (e.g. bool-bool, bool-numerical, bool-categorical, etc.), so I have not had the time to add this yet.

Your approach makes sense. I was wondering if this is something you would be able to do as a pre-processing step, e.g. convert to a UTC timestamp, then proceed as numerical?
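
A minimal sketch of what such a pre-processing step might look like, assuming the data is in a pandas DataFrame. The helper name `datetimes_to_epoch_seconds` and the sample columns are hypothetical (nothing like this exists in sweetviz), and timezone-naive values are assumed to already be in UTC.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

EPOCH = pd.Timestamp("1970-01-01")


def datetimes_to_epoch_seconds(df):
    """Return a copy of df with date/time columns replaced by float epoch seconds."""
    out = df.copy()
    for col in out.columns:
        if not is_datetime64_any_dtype(out[col]):
            continue  # leave non-date/time columns untouched
        values = out[col]
        if values.dt.tz is not None:
            # Normalize timezone-aware values to UTC, then drop the timezone.
            values = values.dt.tz_convert("UTC").dt.tz_localize(None)
        # Assumption: naive values are already UTC. Missing values (NaT) become NaN.
        out[col] = (values - EPOCH) / pd.Timedelta(seconds=1)
    return out


# Made-up sample data, for illustration only.
df = pd.DataFrame({
    "event_time": pd.to_datetime(
        ["2020-01-01 00:00:00", "2020-01-02 12:00:00", None]
    ),
    "value": [1.0, 2.5, 4.0],
})
numeric_df = datetimes_to_epoch_seconds(df)

# The converted frame can then be analyzed as usual, e.g.:
#   import sweetviz as sv
#   sv.analyze(numeric_df).show_html()
```

The date/time columns then show up as plain numerical features, so the report axes would read as epoch seconds rather than dates.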

Perhaps this is something you are already doing and would like the library to do up front? That would make for a quicker integration (just a conversion step).

Thanks again,
Francois