forked from Rspawar/Data-Science-with-R
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathProject Proposal.Rmd
100 lines (69 loc) · 5.14 KB
/
Project Proposal.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
---
Project title: "Data Science with R Project Proposal"
output:
rmarkdown::html_document:
theme: lumen
bibliography: bibliography.bib
---
<style>
body {
text-align: justify}
</style>
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
### Project Proposal on Customer Behavioural Analytics in the Retail sector
<br>
__Project title:__ <font color ="black"> "Customer Behavioural Analytics in the Retail sector" <br /> </font> </font>
__Names of team members:__<br />
<font color ="black">
1. Nadiia Honcharenko (220681) <br />
2. Rutuja Shivraj Pawar (220051, rutuja.pawar@ovgu.de) <br />
3. Shivani Jadhav () <br />
4. Sumit Kundu ()
</font>
__Under the Guidance of:__ <font color ="black"> M.Sc. Uli Niemann </font>
__Date:__ <font color ="black"> ```r format(Sys.Date(), "%B %e, %Y")``` </font>
__Background and motivation:__ <font color ="black">
</font>
__Project objectives:__ <font color ="black"> A customer is a key-centric factor for any business to be successful. Effectively measuring and modeling customer behaviour by understanding what matters the most to them thus devising appropriate strategies can help to enhance the overall customer experience. This eventually helps in the long run towards customer retention and a sustainable growth of the business. Hence, _Understanding the Customer Behavioural Pattern in a Business_ is the crucial problem to be addressed. This project thus aims to address the problem of understanding customer behaviour in the retail sector.<br />
The project intents to discover different analytical insights about the purchase behaviour of the customers through answering the below formulated Research Questions, <br />
__1. Are customers willing to travel long distances to purchase products in spite of the high average product price in a shop?__ <br />
_Relevance:_ This will help to understand whether the price is an important factor affecting the majority of customers purchase decisions. <br />
__2. Which are the products for which customers are ready to travel long distances and for which products they select the closest shop?__ <br />
_Relevance:_ This will help to understand the nature of the products in the context of proximity. It is assumed that customers will select closest shops to buy daily products like grocery but may travel long distances to buy one-time-purchase products like kitchen equipment. As Data Science is results-driven and not based solely on intuition, this question can help to verify this assumption.<br />
__3. What is the maximum likelihood of a customer to select a particular shop to purchase a particular product?__ <br />
_Relevance:_ This will help to understand that which shops in the retail chain are in demand for a particular product. This can further facilitate better stock management to meet customer demands.<br />
__4. What is the ranking of the shops in terms of attracting more customers and thus generating more revenue and what is their individual customer base?__<br />
_Relevance:_ This will help to understand the most popular shops in the retail chain and target different shop-level marketing schemes to the appropriate customers through customer segmentation. <br />
__5. Which are the customers that are most profitable in terms of revenue generation?__<br />
_Relevance:_ This will help to understand the customers with potential high Customer Lifetime Value and target appropriate loyalty programs to generate satisfied loyal customers as advocates for the business.
</font>
__Name of the dataset to be used:__ <font color ="black"> Supermarket aggr.Customer^[https://bigml.com/user/czuriaga/gallery/dataset/5559c2c6200d5a6570000084] <br />
The dataset to be used is the retail market data of one of the largest Italian retail distribution company called _Coop_ for a single Italian city [@pennacchioli2013explaining].<br />
The Supermarket aggr.Customer dataset used for the analysis contains data aggregated from the original datasets^[http://www.michelecoscia.com/?page_id=379] [@pennacchioli2013explaining] and mapped to new columns. The dataset thus contains 40 features with 60,366 instances and is approximately 14.0 MB in size. </font>
__Design overview:__ <font color ="black"> </font>
__Time plan:__ <font color ="black"> </font>
__GitHub Repository:__ https://github.com/Rspawar/Data-Science-with-R.git
__References:__
---
#Comments
# RQ3
# This requires association algorithm (maximum likehood analysis) to predict that a customer will select this particular shop always to purchase a particular product with a given ID (Calculated based on the maximum number of customers visiting a particular shop to buy a particular product)
# Dont display warnings
# {r, warning = FALSE}
# Dont display code
# {r, echo=FALSE}
# To store long computation results in a local cache
# {r, cache=TRUE}
# Exmaple usage of labelling and reusing code chunks
# ```{r iris_plot, echo = FALSE, eval = FALSE}
# library(ggplot2)
# ggplot(iris, aes(x = Species, y = Sepal.Length)) + geom_boxplot()
#```
# ```{r , ref.label='iris_plot', fig.width = 7, fig.height = 2.5}
# ```
# Inline R code usng r
# A random sample of 5 numbers from the set of numbers between
# 1 and 10 is `r sample(1:10, 5)`
---