from dfcorrs.cramersvcorr import Cramers
import pandas as pd

cramers = Cramers()
data = pd.read_csv(r'../adatasetwithlotsofcategoricalandcontinuousfeatures.csv')

# Cramér's V comparison between all categorical features;
# returns a pandas DataFrame similar to .corr()
cramers.corr(data)

# Plot the correlation heatmap using Plotly
cramers.corr(data, plot_htmp=True)

# Single out one categorical feature and observe its correlations;
# returns a pandas Series
cramers.corr(data)['feature_name']
By default, pandas may sometimes misinterpret a sparse or categorical feature as continuous (for example, 'City Code' or 'Candidate ID'), and vice versa. To handle that:
To add categorical columns to the Cramér's V comparison, use:
cramers.corr(data, add_cols=['feature_name'])
# The added column must be present in the dataset provided.
# Use .astype('str') to force-convert any falsely identified continuous columns before calling.
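The force-conversion mentioned above can be sketched in plain pandas. This is a minimal illustration, not part of the library: the small DataFrame and its 'Salary' column are made up for the example, and the final `cramers.corr` call is shown only as a comment because it depends on the package being installed.

```python
import pandas as pd

# Hypothetical toy data: 'City Code' holds identifiers, not measurements,
# but pandas infers an integer (numeric) dtype for it.
df = pd.DataFrame({
    "City Code": [101, 102, 101, 103],
    "Salary": [50000.0, 62000.0, 58000.0, 61000.0],
})

# Force the identifier column to strings so it is treated as categorical.
df["City Code"] = df["City Code"].astype("str")

# The column is no longer numeric after the conversion.
is_numeric = pd.api.types.is_numeric_dtype(df["City Code"])

# It could then be passed on as described in this README, e.g.:
# cramers.corr(df, add_cols=['City Code'])
```

Converting before the call keeps the intent explicit in the data itself instead of relying on type inference.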
To remove categorical (or redundant) columns from the Cramér's V comparison, use:
cramers.corr(data, rem_cols=['feature_name'])
To use the wrapper for a single-shot Cramér's V correlation on two Python lists or two separate pandas DataFrame columns:
# Single-shot operation: does not remap after applying the operation on the entire dataframe
cramers.cramers_v(data['feature_name1'], data['feature_name2'])
cramers.cramers_v([i for i in someclasses1], [i for i in someclasses2])  # say we have two Python lists instead
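For readers who want to see what a single-shot Cramér's V on two sequences computes, here is a minimal standard-library sketch of the textbook (uncorrected) statistic, V = sqrt(chi² / (n · (k − 1))), where k is the smaller of the two category counts. The function name `cramers_v_sketch` is hypothetical. It is not the library's implementation, which may differ, for example by applying a bias correction.

```python
from collections import Counter
from math import sqrt


def cramers_v_sketch(x, y):
    """Uncorrected Cramér's V for two equal-length sequences of category labels."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")

    n = len(x)
    row_totals = Counter(x)          # marginal counts of categories in x
    col_totals = Counter(y)          # marginal counts of categories in y
    observed = Counter(zip(x, y))    # joint counts (contingency table cells)

    k = min(len(row_totals), len(col_totals))
    if n == 0 or k < 2:
        # Assumption of this sketch: report 0.0 when a variable has fewer
        # than two categories, where the statistic is otherwise undefined.
        return 0.0

    # Pearson chi-squared statistic over every cell of the table.
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected

    return sqrt(chi2 / (n * (k - 1)))


perfect = cramers_v_sketch(["a", "a", "b", "b"], ["u", "u", "v", "v"])
independent = cramers_v_sketch(["a", "a", "b", "b"], ["u", "v", "u", "v"])
```

In this sketch, `perfect` pairs each label in the first list with exactly one label in the second, while `independent` spreads them evenly, which gives the two extremes of the statistic.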