Association between variables

Hi,
Congrats on creating a nice, quick tool for getting a first glance at data. I particularly liked the ‘compare data’ feature.
I tried using Sweetviz on the Titanic train and test data sets (from Kaggle), but I have some doubts/issues that I hope you can answer.

  1. How is the association between the variables measured?
    I believe that when both variables are numerical, it’s Pearson’s correlation coefficient. However, the report shows ‘0’ for age-age [while it should be 1].
    Also, ‘P-class’ and ‘Age’ are negatively associated, but the report shows a positive association.
    And when both variables are categorical, is it measured by Cramér’s V? I didn’t find the details in the documentation. How about the association between one numerical and one categorical variable?

Correlation using pandas df.corr()

Sweetviz report for Titanic Train Data

Hello Nisha,

I believe your message got cut off, so I only see your question 1. For information on associations, check out the GitHub page, section “Special thanks & related materials”. It links to the material that inspired the associations, which were taken from here:

Shaked Zychlinski: The Search for Categorical Correlation
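
For anyone who wants a quick feel for the kinds of measures discussed in the question and in that article (Pearson's r for numerical pairs, Cramér's V and Theil's U for categorical pairs, and the correlation ratio for a categorical-numerical pair), here is a minimal sketch in plain Python. It is only an illustration under stated assumptions, not Sweetviz's actual code: the function names and the tiny toy columns are made up for this example, and Cramér's V is shown without bias correction.

```python
import math
from collections import Counter, defaultdict


def pearson(x, y):
    """Pearson's r for two numerical sequences (signed, between -1 and 1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cramers_v(x, y):
    """Cramér's V for two categorical sequences (symmetric, between 0 and 1)."""
    n = len(x)
    cx, cy, joint = Counter(x), Counter(y), Counter(zip(x, y))
    chi2 = 0.0
    for a in cx:
        for b in cy:
            expected = cx[a] * cy[b] / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected
    k = min(len(cx), len(cy)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


def theils_u(x, y):
    """Uncertainty coefficient U(x|y): asymmetric, between 0 and 1."""
    n = len(x)
    cx, cy, joint = Counter(x), Counter(y), Counter(zip(x, y))
    h_x = -sum(c / n * math.log(c / n) for c in cx.values())
    if h_x == 0:
        return 1.0  # x is constant, so nothing is left to explain
    # Conditional entropy H(x|y) = -sum p(x, y) * log(p(x, y) / p(y))
    h_x_given_y = -sum(c / n * math.log(c / cy[b]) for (_, b), c in joint.items())
    return (h_x - h_x_given_y) / h_x


def correlation_ratio(categories, values):
    """Correlation ratio (eta) of a numerical column grouped by a categorical one."""
    groups = defaultdict(list)
    for c, v in zip(categories, values):
        groups[c].append(v)
    mean = sum(values) / len(values)
    between = sum(len(g) * (sum(g) / len(g) - mean) ** 2 for g in groups.values())
    total = sum((v - mean) ** 2 for v in values)
    return math.sqrt(between / total) if total > 0 else 0.0


# Hypothetical toy columns, loosely shaped like the Titanic data (not real rows).
age = [22, 38, 26, 35, 54, 2]
pclass = [3, 1, 3, 1, 1, 3]
sex = ["m", "f", "f", "f", "m", "m"]

print("age vs age (Pearson):", pearson(age, age))        # approximately 1.0
print("age vs pclass (Pearson):", pearson(age, pclass))  # negative for this toy data
print("sex vs pclass (Cramér's V):", cramers_v(sex, pclass))
print("sex given pclass (Theil's U):", theils_u(sex, pclass))
print("age by sex (correlation ratio):", correlation_ratio(sex, age))
```

Note that only Pearson's r carries a sign. The other three are non-negative, and Theil's U is asymmetric, so U(x|y) and U(y|x) generally differ. That is worth keeping in mind when comparing a report against `df.corr()`.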

I hope to hear about your other questions soon!

Please note that I will have limited access to the Internet for the next few weeks so I may not answer until later on, but please let me know any thoughts and I will get back to them for sure!

Francois