Assortment on the Bases of Big-Data Analytics: A Quantitative Analysis on Retail Industry


Benazir Bhutto Shaheed University Lyari, Karachi, Pakistan
Khadim Ali Shah Bukhari Institute of Technology, Karachi, Pakistan
Karachi University Business School, University of Karachi, Karachi, Pakistan

Abstract

Big-data analytics is regarded as the future of technology and as essential for the retail sector. The technology is especially beneficial for optimising daily operations and supply chain practices. However, there is a wide gap in the related literature on developing and Asian countries. At the same time, retailing is one of the fastest-growing industries globally. Therefore, this study is designed to understand the role of big-data in the organised retail sector of Pakistan. The study’s primary objective is to assess the significance of the technology in augmenting assortment strategies. The mediation of advanced algorithms and the moderation of skilled data scientists are included in the research construct to increase the study’s practical relevance. Results were obtained by applying partial least squares structural equation modeling (PLS-SEM). The findings indicate that big-data contributes significantly to optimising assortment in the retail sector of Pakistan. However, the technology would not produce the desired results without applying advanced algorithms. This study underscores that advanced algorithms are essential for using big-data effectively to retrieve new information. Further studies may develop a comprehensive model that includes all the important variables associated with store-layout design.

Keywords

technology, retail, assortment strategies, big data analytics

INTRODUCTION

The era of big-data is accompanied by social media, cell phones, the internet of things, e-commerce, and search engines. Thus, the data stored in IT systems is doubling rapidly, and this increase is largely caused by the massive amount of data created by organisations (Surbakti, Wang, Indulska, & Sadiq, 2020). Previous research highlights that this technology may be highly beneficial for retailers and may aid profitability; especially during volatile economic conditions, the technology may act as a significant resource and opportunity for fashion retailers (Silva, Hassani, & Madsen, 2020). The use of big-data is recommended for organisational-level benefits, especially in the banking, retail, insurance, and government sectors (Yadav, Wang, & Kumar, 2013).

The postulate looks valid, as retailers witness a large stream of customer visits daily and hence need a unified picture of customers’ preferences, especially for making strategic decisions (Butt, Suroor, Hameed, & Mehmood, 2021; Chauhan, Mahajan, & Lohare, 2017). Well-known brands use the technology to analyse consumer demand patterns and to assess adequate stock, price, colours, design, etc. (Oosthuizen, Botha, Robertson, & Montecchi, 2020). Thus, the increase in demand for data-management experts is legitimate, and organisations are striving to increase their budgets to recruit skilful data analysts (Surbakti et al., 2020). Although several types of data are available, e.g., consumer reviews, search data, and shopping baskets, extensive and effective research requires shopping basket data (Tharwat & Gabel, 2019).

Few studies provide insights into this beneficial technology with respect to Pakistan, and the country has a significant lack of firms that incorporate big-data. This lacklustre approach is due to insufficient technology and cultural differences from the western world (Shaikh, Sultan, Khaskhelly, & Zehra, 2021). On the other side, analysis of big-data yields correlations among several variables; however, correlation does not imply causation, and therefore making strategic decisions is not easy (Blackburn, Alexander, Legan, & Klabjan, 2017). To assess anomalies in consumption patterns, it is legitimate for retailers to consider the use of advanced algorithms while using big-data analytics (Silva et al., 2020). One must also not ignore the complexity of analysis based on big-data. Not all correlations are significant (Chauhan et al., 2017); hence the skill set of big-data analysts is also critical (Surbakti et al., 2020). Thus, this study uses advanced algorithms as a mediator, following Silva, Khan, and Han (2017), which was not considered in previous studies. However, without including the moderation of skilled data-scientists, the study would not be compelling, as indicated by Surbakti et al. (2020). The study is one of those aiming to highlight the significance of big-data, which is termed the future of technology (Katal, Wazid, & Goudar, 2013).

The country’s lack of research and application has been observed for a long time (Shaikh et al., 2021). Therefore, the significance of the study is manifold: it will not only increase research work on big-data analytics regarding Pakistan but will also relate the work to the industry that is perceived as the best for the application of big-data (Dekimpe, 2020). Moreover, the study is also related to store layout design, one of the most critical aspects of retail business, which significantly impacts consumer buying behaviour (Aktas & Meng, 2017). Last but not least, to understand the significance of the study, one must consider the significant contribution of the retail sector to the economy of Pakistan (Ahmed, Ullah, & Paracha, 2012). Hence, the study is broad in scope and can provide valuable insights to researchers, academicians, and managers to better understand the benefits and implications of the technology.

LITERATURE REVIEW

Big-Data and its Significance for the Retail Sector

The heart of retailing lies in understanding consumers’ actual consumption, as creating a better customer experience is desirable to differentiate a retailer from others. However, these elements include some controllable and some uncontrollable ones (Grewal & Roggeveen, 2020). Therefore, big-data seems vital for retail business, but its importance for marketing and related activities is still under observation (Sultan, Shaikh, & Imtiaz, 2021). The use of big-data as an assessment tool has also been supported by the use of technology. The importance of customers’ experience is well recognised by industry and academia (Grewal & Roggeveen, 2020). However, pursuing retail business always requires knowledge of customers’ demographics in order to devise more effective, targeted, and relevant communication programs and better supply chain operations (Ahmed, 2020; Chauhan et al., 2017). Therefore, the role and importance of technology in planning are inevitable.

In this regard, big-data may help retailers assess anomalies by observing practices and patterns that look unusual (Santoro, Fiano, Bertoldi, & Ciampi, 2019). The statement is valid, as research companies have used previous data from store scanners and panel data on household purchases to formulate adequate research studies (Dekimpe, 2020). In recent times, surveillance of critical characteristics such as the quantity of purchase, frequency of purchase, and location of the outlet may result in effective outcomes through applying big-data analytics. However, chances of failure remain, as not every retailer assesses consumer purchase patterns effectively.

Big-Data on Assortment Strategies of Retail Sector

Sales in the retail segment rely significantly on two main aspects: e-marketing practices (Khurram, Sultan, & Turi, 2019) on the demand side, and data handling and assortment strategies on the supply side. From the supply-side perspective, emergent technological advancements make assortment one of the most critical challenges for the retail segment (Aktas & Meng, 2017). Companies are using advanced technologies to understand individual consumer behaviour and to relate their findings to it. This knowledge will aid companies in micro-segmentation at granular levels (Sanders, 2016). The statement is valid, as the number of SKUs and related information will add rows and columns to the data sets (Bradlow, Gangwar, Kopalle, & Voleti, 2017). Big-data allows firms to assess real-time information from the point of sale (PoS) to improve daily decision making associated with inventory levels, re-stocking, etc. (Bradlow et al., 2017). Therefore, big-data is a tool that may aid retailers in stocking products with high demand or sales potential (Lekhwar, Yadav, & Singh, 2019). Similarly, it is legitimate to regard big-data as a tool that may help the retail sector devise better assortment strategies through correlational analyses of the items purchased in association with time and location (Aktas & Meng, 2017).

Skillset of Data-Scientists

Timeliness is one of the significant challenges for the retail sector. Effective strategies need to be implemented within an optimal time frame to improve retail operations and practices (Aktas & Meng, 2017). Therefore, the skills of data-scientists are important to consider when seeking to take advantage of big-data analytics. However, inadequate staffing is also a concern, as companies have to incur a significant expense to retain skilful in-house data management professionals (Santoro et al., 2019).

Use of Advanced Algorithms

There is a need to extract complex information hidden in complex data sets, and therefore the interrelationship of data is essential for extracting relevant and fruitful information (Yadav et al., 2013). However, the best use of big-data analytics is in day-to-day operations (Oosthuizen et al., 2020), which are also not free from anomalies of relationships. Thus, companies must try to identify those situations where the communication looks most relevant (Chauhan et al., 2017) and relate communications with advanced algorithms to obtain the desired and adequate results regarding customer preferences (Ahsan, Hon, & Albarbar, 2020).

RESEARCH MODEL AND HYPOTHESIS

https://typeset-prod-media-server.s3.amazonaws.com/article_uploads/dee4fcc9-645f-41fb-939a-51df9ba63656/image/a18e6f8d-b5bc-4bba-835f-f85c5b1054d3-upicture3.png
Figure 1: Research Model

H1 There is a relationship between the use of big-data and the application of advanced algorithms.

H2 There is a relationship between the use of advanced algorithms and the optimisation of assortment strategies in the organised retail sector.

H3 Advanced algorithms do not mediate the relationship between big-data and the optimisation of assortment strategies.

H4 Skilled data-scientists do not moderate the relationship between advanced algorithms and the optimisation of assortment strategies in the organised retail segment.

RESEARCH METHODOLOGY

Research methodology is a general set of assumptions used to conduct a study in the desired manner (Long, 2014), which interrelates all the parameters to ensure coherence between all aspects of the study (Brannick & Roche, 1997). The parameters provided by Saunders, Lewis, Thornhill, and Wilson (2009) are found to be very important in formulating a practical set of strategies to conduct research. Therefore, these parameters have also been followed in this study.

Research Design

Research design is the set of strategies applied to answer the research questions by relating assumptions to the research findings (Kothari, 2004). Following Saunders et al. (2009), this study uses epistemology as the research philosophy. The purpose is to increase knowledge with reference to Pakistan and the organised retail sector. Hence, it is also accompanied by post-positivism as the research stance (research paradigm), as indicated by Zukauskas, Vveinhardt, and Andriukaitienė (2018), to relate the work to quantitative techniques.

Sampling Design

The research follows Aktas and Meng (2017) in collecting data from industry experts, using quota sampling as indicated by Yang and Banamah (2014). Because several unexplored variables are incorporated, the study’s approach is theory-building, which is consistent with Pathirage, Amaratunga, and Haigh (2008), and hence may work with a smaller sample size, i.e., 100 respondents. The respondents are from the IT departments of organised retailers that mainly deal with FMCG items. These considerations are made to extend the work of Shaikh et al. (2021), which does not address the use of advanced algorithms.

Software and Statistical Technique

There are several reasons to use SmartPLS (Hair, Ringle, & Sarstedt, 2013). The two most common and preferred reasons are a small sample size and a non-normal distribution (Chin, 1998). The statistical technique used for the analysis is structural equation modeling, and SmartPLS is best suited for this technique (Hair, Risher, Sarstedt, & Ringle, 2019; Richter, Sinkovics, Ringle, & Schlägel, 2016).

STATISTICAL ANALYSIS

According to the rule of thumb, outer loading values for all the indicators must be higher than or equal to 0.708 (Hair et al., 2013), and values lower than 0.70 should be excluded, particularly when the exclusion improves the overall reliability of the construct (Hair et al., 2013).

However, in the case of exploratory analysis, outer loadings of 0.60 or above are treated as acceptable for inclusion in the construct (Afthanorhan, 2014). Therefore, in the light of these parameters, all the indicators listed in Table 1 are acceptable for inclusion in this research.
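
The screening rule described above can be sketched as follows. The loadings are those reported in Table 1 (the interaction term BD*SDS is left out because it is not a measured indicator); the function name `screen_loadings` and the group labels are illustrative assumptions, not part of SmartPLS.

```python
# Outer loadings as reported in Table 1 (interaction term BD*SDS omitted).
OUTER_LOADINGS = {
    "AAA1": 0.728, "AAA2": 0.815, "AAA3": 0.763, "AAA5": 0.624,
    "AS1": 0.810, "AS2": 0.774, "AS3": 0.697, "AS4": 0.759, "AS5": 0.724,
    "BD1": 0.793, "BD2": 0.815, "BD3": 0.757, "BD4": 0.636,
    "SDS1": 0.674, "SDS2": 0.689, "SDS3": 0.723,
    "SDS5": 0.707, "SDS6": 0.668, "SDS7": 0.715,
}

STRICT_CUTOFF = 0.708       # rule of thumb cited from Hair et al. (2013)
EXPLORATORY_CUTOFF = 0.60   # exploratory threshold cited from Afthanorhan (2014)

def screen_loadings(loadings, strict=STRICT_CUTOFF, exploratory=EXPLORATORY_CUTOFF):
    """Group indicators by which loading threshold they satisfy."""
    groups = {"meets_strict": [], "exploratory_only": [], "below_exploratory": []}
    for name, value in loadings.items():
        if value >= strict:
            groups["meets_strict"].append(name)
        elif value >= exploratory:
            groups["exploratory_only"].append(name)
        else:
            groups["below_exploratory"].append(name)
    return groups

screening = screen_loadings(OUTER_LOADINGS)
for group, names in screening.items():
    print(group, names)
```

Under these thresholds no indicator falls below the exploratory cut-off, which is consistent with the decision to retain all of them.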

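Table 1: Outer Loadings

| Indicator | Advanced Analysis Algorithms | Assortment Strategies | Big Data | Moderating Effect SDS | Skilled Data Scientists |
|---|---|---|---|---|---|
| AAA1 | 0.728 | | | | |
| AAA2 | 0.815 | | | | |
| AAA3 | 0.763 | | | | |
| AAA5 | 0.624 | | | | |
| AS1 | | 0.810 | | | |
| AS2 | | 0.774 | | | |
| AS3 | | 0.697 | | | |
| AS4 | | 0.759 | | | |
| AS5 | | 0.724 | | | |
| BD1 | | | 0.793 | | |
| BD2 | | | 0.815 | | |
| BD3 | | | 0.757 | | |
| BD4 | | | 0.636 | | |
| BD*SDS | | | | 1.147 | |
| SDS1 | | | | | 0.674 |
| SDS2 | | | | | 0.689 |
| SDS3 | | | | | 0.723 |
| SDS5 | | | | | 0.707 |
| SDS6 | | | | | 0.668 |
| SDS7 | | | | | 0.715 |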

R-Square (predictive accuracy) is the portion of the variance in the endogenous variables that is explained by the structural model (Ringle, Silva, & Bido, 2015). For reflective models, the least acceptable value to reflect the impact of the independent variable is 0.25, while 0.50 is termed moderate and 0.75 or above is treated as substantial (Hair et al., 2013). Therefore, in the light of these parameters, all the relations shown in Table 2 seem to have a moderate fit and are hence appropriate to reflect the variance caused by the independent variables (Benitez, Henseler, Castillo, & Schuberth, 2020).
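
As a small illustration of these cut-offs, the sketch below classifies the R-Square values reported in Table 2. The function name `classify_r_square` and the label "minimally acceptable" for the 0.25 band are illustrative choices, not terminology from the study.

```python
# R-Square values as reported in Table 2.
R_SQUARE = {
    "Advanced Analysis Algorithms": 0.704,
    "Assortment Strategies": 0.649,
}

def classify_r_square(r2):
    """Map an R-Square value onto the bands described in the text."""
    if r2 >= 0.75:
        return "substantial"
    if r2 >= 0.50:
        return "moderate"
    if r2 >= 0.25:
        return "minimally acceptable"
    return "below threshold"

labels = {name: classify_r_square(value) for name, value in R_SQUARE.items()}
for name, label in labels.items():
    print(f"{name}: {label}")
```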

Table 2: Quality Criteria (Predictive Accuracy)

| Construct | R Square | R Square Adjusted |
|---|---|---|
| Advanced Analysis Algorithms | 0.704 | 0.681 |
| Assortment Strategies | 0.649 | 0.632 |

https://s3-us-west-2.amazonaws.com/typeset-prod-media-server/9be921e6-dbe4-4c42-8b4f-8bf07becd53eimage2.png
Figure 2: Confirmatory Factor Analysis

Confirmatory Factor Analysis and Outer Loadings

Table 3 indicates Convergent Validity and Construct Reliability (Ringle et al., 2015). AVE measures convergent validity, which measures the extent to which the variable is correlated with the construct (Benitez et al., 2020).

Therefore, AVE values of 0.5 or above are treated as confirmation of convergent validity (Ringle et al., 2015). This is shown in Table 3, where AVE is more than 0.5 in each case, thus ensuring convergent validity. Table 3 also indicates construct reliability through Cronbach’s Alpha (α), Dillon-Goldstein’s rho, and composite reliability. The construct reliability is appropriate, as all the values reflecting construct reliability are higher than 0.7 (Sijtsma, 2009).
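
For readers who want to see how these statistics relate to the outer loadings, the following sketch applies the standard formulas for standardized loadings: AVE as the mean squared loading, and composite reliability as the squared sum of loadings divided by that squared sum plus the summed error variances. It uses the rounded loadings of three constructs from Table 1, so the results can differ slightly from the reported values; the function names are illustrative.

```python
# Rounded outer loadings from Table 1 for three of the constructs.
LOADINGS = {
    "Advanced Analysis Algorithms": [0.728, 0.815, 0.763, 0.624],
    "Assortment Strategies": [0.810, 0.774, 0.697, 0.759, 0.724],
    "Big Data": [0.793, 0.815, 0.757, 0.636],
}

def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)

results = {
    name: (average_variance_extracted(vals), composite_reliability(vals))
    for name, vals in LOADINGS.items()
}
for name, (ave, cr) in results.items():
    print(f"{name}: AVE={ave:.3f}, CR={cr:.3f}")
```

For these three constructs the recomputed values agree with Table 3 to within rounding of the loadings.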

Table 3: Construct Reliability and Convergent Validity

| Construct | Cronbach’s Alpha | Composite Reliability | Average Variance Extracted (AVE) |
|---|---|---|---|
| Advanced Analysis Algorithms | 0.737 | 0.824 | 0.542 |
| Assortment Strategies | 0.809 | 0.868 | 0.568 |
| Big Data | 0.745 | 0.839 | 0.568 |
| Moderating Effect SDS | 1.000 | 1.000 | 1.000 |
| Skilled Data Scientists | 0.790 | 0.849 | 0.618 |

Table 4: Discriminant Validity (Heterotrait-Monotrait Ratio)

| | AAA | AS | BD | BD*SDS |
|---|---|---|---|---|
| AAA | | | | |
| AS | 0.459 | | | |
| BD | 0.394 | 0.489 | | |
| BD*SDS | 0.274 | 0.136 | 0.115 | |
| SDS | 0.343 | 0.681 | 0.488 | 0.122 |

The purpose of using HTMT as a tool for discriminant validity is to highlight, through numeric values, the difference among the variables of the same construct (Cheung & Lee, 2010). The Fornell and Larcker (1981) criterion and cross-loadings seem to be less effective measures than HTMT. HTMT is used to detect collinearity problems among latent constructs. Thus, it is legitimate to treat HTMT as a tool for assessing multi-collinearity (Hamid, Sami, Sidek, & M, 2017). The results must be lower than 0.85, as values higher than 0.85 indicate a lack of discriminant validity (Hair, Sarstedt, Ringle, & Gudergan, 2017). According to Table 4, no value conflicts with this criterion. Therefore, no case of multi-collinearity is detected in the model.
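
The threshold check described above can be expressed as a short sketch. The HTMT values are those reported in Table 4; the dictionary layout and the function name `htmt_violations` are illustrative assumptions.

```python
# HTMT ratios as reported in Table 4 (lower triangle), keyed by construct pair.
HTMT = {
    ("AS", "AAA"): 0.459,
    ("BD", "AAA"): 0.394,
    ("BD", "AS"): 0.489,
    ("BD*SDS", "AAA"): 0.274,
    ("BD*SDS", "AS"): 0.136,
    ("BD*SDS", "BD"): 0.115,
    ("SDS", "AAA"): 0.343,
    ("SDS", "AS"): 0.681,
    ("SDS", "BD"): 0.488,
    ("SDS", "BD*SDS"): 0.122,
}

HTMT_THRESHOLD = 0.85  # cut-off cited from Hair et al. (2017)

def htmt_violations(ratios, threshold=HTMT_THRESHOLD):
    """Return the construct pairs whose HTMT ratio is not below the threshold."""
    return [pair for pair, value in ratios.items() if value >= threshold]

violations = htmt_violations(HTMT)
largest_pair = max(HTMT, key=HTMT.get)
print("Violations:", violations)
print("Largest ratio:", largest_pair, HTMT[largest_pair])
```

The largest ratio (SDS with AS) stays well below the cut-off, so the list of violations is empty.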

Table 5: Path Coefficients

| Path | Original Sample (O) | Standard Deviation | T Statistics | P Values |
|---|---|---|---|---|
| Advanced Analysis Algorithms -> Assortment Strategies | 0.219 | 0.052 | 4.211 | 0.000 |
| Big Data -> Advanced Analysis Algorithms | 0.322 | 0.047 | 6.797 | 0.000 |
| Big Data -> Assortment Strategies | 0.153 | 0.052 | 2.952 | 0.003 |
| Moderating Effect SDS -> Assortment Strategies | -0.025 | 0.047 | 0.530 | 0.596 |
| Skilled Data Scientists -> Assortment Strategies | 0.441 | 0.047 | 9.397 | 0.000 |

Table 5 indicates the impact of the latent variables on each other through inferential statistics. This is treated as one of the significant tools for assessing reflective measurement models in SmartPLS. The tool uses t-statistics and p-values to assess the impact, and both criteria must be fulfilled for an appropriate analysis. The p-value must not exceed 0.05; above that value, no relationship among the latent variables can be asserted. Similarly, the minimum t-statistic needed to support a relationship is 1.97; below this value, no relationship between the latent variables can be asserted (Hair, Hult, Ringle, & Sarstedt, 2016).
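
The two decision criteria can be illustrated with the values from Table 5. In the sketch below, `is_supported` applies the stated thresholds, and `two_tailed_p` recomputes a p-value from a t-statistic using a standard normal approximation; that approximation is an assumption made here for illustration and is not necessarily how SmartPLS derives its p-values.

```python
import math

# (t-statistic, p-value) for each path, as reported in Table 5.
PATHS = {
    "AAA -> AS": (4.211, 0.000),
    "BD -> AAA": (6.797, 0.000),
    "BD -> AS": (2.952, 0.003),
    "Moderating Effect SDS -> AS": (0.530, 0.596),
    "SDS -> AS": (9.397, 0.000),
}

T_MIN = 1.97  # minimum t-statistic stated in the text
P_MAX = 0.05  # maximum p-value stated in the text

def two_tailed_p(t_stat):
    """Two-tailed p-value under a standard normal approximation (assumption)."""
    return math.erfc(abs(t_stat) / math.sqrt(2))

def is_supported(t_stat, p_value, t_min=T_MIN, p_max=P_MAX):
    """A relationship is supported only if both criteria are met."""
    return t_stat >= t_min and p_value <= p_max

decisions = {path: is_supported(t, p) for path, (t, p) in PATHS.items()}
for path, supported in decisions.items():
    t, p = PATHS[path]
    print(f"{path}: t={t}, p={p}, approx p={two_tailed_p(t):.3f}, supported={supported}")
```

Applied to Table 5, only the moderating effect fails both criteria; the other four paths meet them.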

Table 5 also indicates that skilled data scientists are a critical predictor of assortment strategies. However, the moderation of skilled data scientists is not found to be significant in optimising assortment strategies. This might be because the FMCG sector mostly has entire demand, and therefore the assortment is mainly affected by the use of advanced algorithms rather than by the skills of data scientists.

Therefore, in the light of these parameters, big-data is perceived as an effective tool for optimising assortment strategies in the organised retail sector of Pakistan. However, big-data cannot be used effectively without appropriate knowledge of advanced algorithms. Similarly, the mediation of advanced algorithms is significant in applying big-data to the assortment strategies of the retail sector (Figure 2).
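
One way to get a feel for the size of the mediated path is to multiply the two path coefficients from Table 5 (Big Data -> Advanced Analysis Algorithms, then Advanced Analysis Algorithms -> Assortment Strategies). The sketch below does this and adds a Sobel z-statistic, treating the "Standard Deviation" column as the standard error of each path. The Sobel test is swapped in here purely for illustration; it is not the procedure reported in the study, and the variable and function names are assumptions.

```python
import math

# Path coefficients and standard deviations from Table 5.
A, SE_A = 0.322, 0.047  # Big Data -> Advanced Analysis Algorithms
B, SE_B = 0.219, 0.052  # Advanced Analysis Algorithms -> Assortment Strategies

def indirect_effect(a, b):
    """Product-of-coefficients estimate of the mediated effect."""
    return a * b

def sobel_z(a, se_a, b, se_b):
    """Sobel z-statistic for the indirect effect (illustrative, not the study's test)."""
    standard_error = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    return (a * b) / standard_error

effect = indirect_effect(A, B)
z_value = sobel_z(A, SE_A, B, SE_B)
print(f"Indirect effect: {effect:.4f}")
print(f"Sobel z: {z_value:.2f}")
```

Under these assumptions the z-statistic exceeds the 1.97 threshold used in the text, which points in the same direction as the mediation reading above.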

https://s3-us-west-2.amazonaws.com/typeset-prod-media-server/9be921e6-dbe4-4c42-8b4f-8bf07becd53eimage3.png
Figure 3: Discriminant Validity

DISCUSSION

The findings of this study can be validated and justified through their consistency with previous study findings. This study is consistent with the significant contributions to the literature on big-data and its implications for developing countries and the retail sector. The study’s postulations are valid, as the study is coherent with the findings of Dekimpe (2020): big-data was found to be impactful on the retail sector of Pakistan. Moreover, the study affirms the use of big-data in the assortment strategies of the retail sector and therefore the relationship between assortment strategies and the employment of big-data, as validated by Silva et al. (2020). However, some divergence is found with respect to the indication that anomalies exist and that not all correlations are alike. The moderation of skilled data-scientists is not found to be significant for assortment strategies in the organised retail sector of Pakistan. Hence, the findings differ from previous studies which assert that skilful data scientists are essential in the implementation of assortment strategies in the retail sector. Similarly, it is worth noting study findings that highlighted the high number of data-scientists employed by one of the leading US-based online retailers to analyse customer demand patterns through advanced algorithms. However, these differences might be due to differences in culture and a lack of understanding of the technology. Besides, this study underscores that advanced algorithms are essential for using big-data effectively to retrieve new information (Chauhan et al., 2017).

FUTURE RESEARCH

The study is one of the initial studies to incorporate all the significant variables associated with the application of big-data in the retail sector. However, the sample size was small due to a lack of understanding of the technology and its implications. Hence, assortment strategies are the only variable used to understand the impact. However, store layouts include other important variables, such as aisle design. Therefore, further studies may develop a comprehensive model that includes all the important variables associated with store-layout design. Moreover, comparative studies of online and organised retail segments may also be fruitful for increasing knowledge and understanding of the technology concerning Pakistan.

ACKNOWLEDGEMENT

The authors acknowledge the help provided by the women’s wing of the Karachi Chamber of Commerce and Industry (KCCI), which helped link the researchers with experts in the retail sector of Pakistan. The authors also thank all the respondents for their participation in data collection.