In the competitive retail landscape, small improvements in sales strategy can markedly improve financial outcomes. This study explores the efficacy of data-driven product bundling at the B-70 Mart minimarket using Market Basket Analysis (MBA). The research tests the hypothesis that strategic product bundling, informed by customer purchase patterns, can significantly increase sales. Using transactional data from B-70 Mart, we conducted a quantitative analysis to identify items that are frequently bought together. The MBA helped decipher complex consumer buying behaviors, enabling the creation of optimized product bundles. The study's findings reveal that tailored product combinations, when aligned with consumer preferences deduced from MBA, lead to increased sales, customer satisfaction and inventory turnover. The research not only provides B-70 Mart with actionable insights into its sales strategy but also contributes to the broader field of retail marketing by validating the practical application of MBA to product bundling. The study also offers a methodological framework that retailers can use to apply data analytics to sales enhancement and strategic decision-making.
In the swiftly evolving retail landscape, where data is a driving force behind strategic decision-making, the need to understand and predict consumer behavior becomes paramount. This thesis explores the use of Market Basket Analysis (MBA) at B-70 Mart, a local minimarket in Yogyakarta. It aims to uncover the patterns hidden within transactional data, using the Apriori algorithm to craft and refine product bundling strategies that match customer purchasing habits. The Introduction provides an overview of the research background, delineating the academic and commercial motivations behind this inquiry and setting the stage for a comprehensive exploration of data-driven retail innovation.
Literature Review
Market Basket Analysis
Market Basket Analysis (MBA) is a data mining technique used to understand customer purchase behaviour by uncovering associations and relationships between the different items that customers place in their shopping baskets. The primary goal of MBA is to identify sets of products that are frequently bought together so that retailers can use this information for various marketing and sales strategies, such as product placement, inventory management and cross-selling.
The origins of Market Basket Analysis can be traced back to the work of Agrawal et al. [1], who introduced the concept as part of the broader field of association rule learning in data mining. Since then, MBA has been widely applied in the retail sector to enhance decision-making and increase sales [2]. It operates on the principle that if customers buy a certain group of items, they are more (or less) likely to buy another group of items [3].
One of the critical applications of MBA is in the development of product bundling strategies. By analysing transactional data, retailers can identify which products are commonly purchased together and create bundles that can attract customers and incentivize increased purchase volumes [4]. For example, if bread and butter are frequently bought together, a retailer might place these items in close proximity or offer them as a discounted bundle.
In practice, MBA uses various algorithms to analyse large datasets, with the Apriori algorithm being one of the most prominent and commonly used. This algorithm identifies frequent item sets in transaction data and extends them to larger item sets as long as those item sets appear sufficiently often in the database [1].
As an analytical tool, MBA has evolved with advancements in computing power and data storage technologies, allowing for the processing of vast amounts of transaction data in real-time. This evolution has made MBA an indispensable tool in the modern retail environment [5].
In conclusion, Market Basket Analysis serves as the foundation for many of the data-driven strategies employed in retail today. It provides a scientific approach to understanding consumer behavior, which allows for the optimization of marketing and sales efforts [6] (Figure 1).
Apriori Method
The Apriori Method is a seminal algorithm in the field of data mining that serves as the backbone for discovering frequent item sets in databases and for generating association rules. It was introduced by Agrawal and Srikant in 1994, in their pioneering work on association rule mining. The significance of the Apriori Method lies in its ability to efficiently process large volumes of data to identify items that are often purchased together.
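The level-wise search that Apriori performs can be sketched in a few lines. The following is a minimal plain-Python illustration, not the study's implementation (the study ran its analysis in R); the function name `apriori`, the toy baskets and the 0.6 threshold are assumptions made for this example.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        return sum(1 for b in baskets if itemset <= b) / n

    # Level 1: frequent single items.
    current = {}
    for item in {i for b in baskets for i in b}:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        prev = list(current)
        # Join: merge frequent (k-1)-itemsets whose union has exactly k items.
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical toy baskets, not B-70 Mart data.
toy = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]
result = apriori(toy, min_support=0.6)
```

With these toy baskets only one pair, bread and butter, survives the threshold, so the search stops before any three-item candidate is generated. The pruning step is what keeps the number of candidates manageable on larger data.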
Conceptual Framework
The conceptual framework of this study is shown in Figure 2.
The conceptual framework of this thesis establishes a structured approach to understanding how data-driven techniques can be leveraged to improve product bundling and placement strategies in retail settings, specifically within B-70 Mart.

Figure 1: Market Basket Analysis Illustration

Figure 2: Conceptual Framework Diagram
Data Collection
Specifically, the study will analyze transaction data from B-70 Mart's point-of-sale system, which automatically captures the details of every purchase. The collected data will be processed in RStudio on Posit Cloud, which provides tools suited to handling large datasets. The data collection period spans November to December 2023, a timeframe selected to provide current and actionable insights into consumer behavior. Pre-processing in RStudio will involve cleaning and anonymizing the data so that it is ready for Market Basket Analysis with the Apriori algorithm.
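As an illustration of the pre-processing step, the sketch below groups point-of-sale line items into one basket per transaction. It is a plain-Python illustration rather than the study's R workflow, and the field names `receipt_no`, `cashier` and `item` are hypothetical; the actual export format of the B-70 Mart system is not described here.

```python
from collections import defaultdict

def build_baskets(rows):
    """Group cleaned POS line items into one item set per transaction."""
    grouped = defaultdict(set)
    for row in rows:
        # Cleaning: normalise spacing and case, skip blank item names.
        item = (row.get("item") or "").strip().upper()
        if not item:
            continue
        # Using a set drops repeated lines of the same item on one receipt.
        grouped[row["receipt_no"]].add(item)
    # Anonymizing: keep only the item sets and discard receipt identifiers
    # and any other fields (here, the hypothetical "cashier" column).
    return [frozenset(items) for items in grouped.values()]

# Hypothetical rows, not B-70 Mart data.
rows = [
    {"receipt_no": "R001", "cashier": "A", "item": " Kopi "},
    {"receipt_no": "R001", "cashier": "A", "item": "Rokok"},
    {"receipt_no": "R001", "cashier": "A", "item": "kopi"},
    {"receipt_no": "R002", "cashier": "B", "item": "Korek gas"},
    {"receipt_no": "R002", "cashier": "B", "item": ""},
]
baskets = build_baskets(rows)
```

Each resulting basket is an unordered set of item names, which is the input form that association rule mining expects.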
Samples and Population
The population under study encompasses all of B-70 Mart's customer transactions within the designated two-month period. From this population, a sample of 4,800 transactional events will be analysed. This sample is considered large enough to support reliable MBA results and reflects the typical purchasing activity of B-70 Mart's diverse customer base.
Support
Support is calculated as the proportion of transactions that include a particular item or combination of items within the dataset. Formally, the support for an itemset A is defined as the ratio of transactions containing A to the total number of transactions, given by the formula:
Support (A) = Number of transactions containing A / Total number of transactions
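The support formula can be sketched directly as a count over baskets. This is a minimal illustration in plain Python; the function name `support` and the four toy baskets are assumptions for the example, not data from the study.

```python
def support(baskets, itemset):
    """Share of baskets that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for basket in baskets if itemset <= set(basket))
    return hits / len(baskets)

# Hypothetical toy baskets, not B-70 Mart data.
baskets = [
    {"Kopi", "Rokok"},
    {"Kopi"},
    {"Rokok", "Korek gas"},
    {"Air mineral"},
]
support_kopi = support(baskets, {"Kopi"})                 # 2 of 4 baskets
support_kopi_rokok = support(baskets, {"Kopi", "Rokok"})  # 1 of 4 baskets
```

Adding an item to an itemset can never raise its support, which is the property the Apriori algorithm relies on when it discards candidates.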
Confidence
Confidence measures the likelihood of item B being purchased given that item A has been purchased. This conditional probability is expressed by the confidence formula for a rule A→B:
Confidence (A→B) = Support (A∪B)/Support (A)
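Because both supports in the confidence formula share the same denominator (the total number of transactions), confidence reduces to a ratio of two counts. The sketch below shows this under that reading; the function name `confidence` and the toy baskets are assumptions for illustration only.

```python
def confidence(baskets, lhs, rhs):
    """Confidence of lhs -> rhs: count(lhs and rhs together) / count(lhs)."""
    lhs, rhs = frozenset(lhs), frozenset(rhs)
    count_lhs = sum(1 for b in baskets if lhs <= b)
    count_both = sum(1 for b in baskets if (lhs | rhs) <= b)
    if count_lhs == 0:
        raise ValueError("LHS never occurs, so confidence is undefined")
    return count_both / count_lhs

# Hypothetical toy baskets, not B-70 Mart data.
baskets = [
    {"Korek gas", "Rokok"},
    {"Korek gas", "Rokok"},
    {"Korek gas"},
    {"Rokok"},
    {"Rokok"},
    {"Kopi"},
]
conf_lighter_to_cig = confidence(baskets, {"Korek gas"}, {"Rokok"})  # 2 of 3
conf_cig_to_lighter = confidence(baskets, {"Rokok"}, {"Korek gas"})  # 2 of 4
```

The two directions give different values, which is why a rule A→B and its reverse B→A have to be evaluated separately.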
Lift
Lift gauges the strength of a rule over the random co-occurrence of A and B, reflecting the rule's effectiveness in predicting a sale. The lift of a rule A→B is calculated as:
Lift (A→B) = Confidence (A→B)/Support (B)
A lift value greater than 1 indicates that A and B appear together more often than expected if they were statistically independent.
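A short numerical sketch may make the independence baseline concrete. The support values below are invented for illustration; the function name `lift` and its parameters are likewise assumptions.

```python
def lift(support_ab, support_a, support_b):
    """Lift of A -> B, computed as Confidence(A -> B) / Support(B)."""
    confidence = support_ab / support_a
    return confidence / support_b

# If A and B were independent, Support(A and B) would equal 0.4 * 0.5 = 0.2.
independent = lift(support_ab=0.2, support_a=0.4, support_b=0.5)  # close to 1.0
# Observing A and B together more often than that pushes lift above 1.
associated = lift(support_ab=0.3, support_a=0.4, support_b=0.5)   # close to 1.5
# Swapping the roles of A and B leaves lift unchanged.
reversed_rule = lift(support_ab=0.3, support_a=0.5, support_b=0.4)
```

Unlike confidence, lift is symmetric in A and B, so it measures association strength rather than direction.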

Figure 3: Research Design Flow Chart
These metrics will guide the identification of product pairs or groups with high transactional affinity. The process is meticulously designed to filter through noise and highlight statistically significant patterns, as described by Hahsler et al. [7], who underscore the practical importance of understanding these metrics in the context of retail data mining.
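One way the three metrics could be combined to screen product pairs is sketched below. This is an illustrative plain-Python version, not the procedure used in the study; the function name `pair_rules`, the toy baskets and both thresholds are hypothetical.

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence, lift) for pairs passing all filters."""
    n = len(baskets)
    item_counts = Counter(item for b in baskets for item in set(b))
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(set(b)), 2)
    )
    rules = []
    for (x, y), count in pair_counts.items():
        sup = count / n
        if sup < min_support:
            continue
        # Evaluate both directions, since confidence is not symmetric.
        for lhs, rhs in ((x, y), (y, x)):
            conf = count / item_counts[lhs]
            lift = conf / (item_counts[rhs] / n)
            if conf >= min_confidence and lift > 1:
                rules.append((lhs, rhs, sup, conf, lift))
    # Strongest associations first.
    return sorted(rules, key=lambda r: r[4], reverse=True)

# Hypothetical toy baskets, not B-70 Mart data.
baskets = [
    {"Korek gas", "Rokok"},
    {"Korek gas", "Rokok"},
    {"Kopi", "Rokok"},
    {"Kopi"},
    {"Air mineral"},
    {"Air mineral", "Kopi"},
]
rules = pair_rules(baskets, min_support=0.3, min_confidence=0.8)
```

In this toy example only the rule from Korek gas to Rokok passes all three filters; the reverse rule meets the support threshold but falls below the confidence threshold.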
Business Solution
Two Products Bundling Analysis
This chapter analyzes which product combinations the available data supports for bundling. In summary, the analysis yields 70 two-product combinations and 3 three-product combinations; no combination of more than three products can be suggested.
The 70 two-product bundling combinations are shown in Tables 1-3 and Figure 4.
Table 1: Support, Confidence and Lift of 2 Product Bundling (List 1-20)

Table 2: Support, Confidence and Lift of 2 Product Bundling (List 21-40)

Table 3: Support, Confidence and Lift of 2 Product Bundling (List 41-70)

Figure 4: Network Visualization of Product Correlations for Two-Product Bundling
Three Products Bundling Analysis
The 3 three-product bundling combinations are shown in Table 4 and Figure 5.
Product Bundling Analysis Based On Product Categories
A new cigarette product (Rokok Juara) has recently been introduced, and product bundling analysis is expected to help increase its sales.
Table 4: Support, Confidence and Lift of 3 Products Bundling
| Rules | LHS | RHS | Support | Confidence | Coverage | Lift | Count |
| 1 | {PERMEN KACAMATA, TEH PCK HRM 350} | {YUPI ALLVAR 700} | 0.002139 | 1.000000 | 0.002139 | 51.92593 | 3 |
| 2 | {PERMEN KACAMATA, YUPI ALLVAR 700} | {TEH PCK HRM 350} | 0.002139 | 1.000000 | 0.002139 | 43.81250 | 3 |
| 3 | {TEH PCK HRM 350, YUPI ALLVAR 700} | {PERMEN KACAMATA} | 0.002139 | 0.428571 | 0.004993 | 42.91837 | 3 |
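The values in Table 4 can be cross-checked against the confidence formula. The sketch below assumes that the Coverage column denotes the support of the left-hand side, which is how the arules package cited as [7] defines it; under that assumption, Support divided by Coverage should reproduce the reported Confidence up to rounding.

```python
# (rule, support, confidence, coverage, lift, count) as reported in Table 4.
table4 = [
    (1, 0.002139, 1.000000, 0.002139, 51.92593, 3),
    (2, 0.002139, 1.000000, 0.002139, 43.81250, 3),
    (3, 0.002139, 0.428571, 0.004993, 42.91837, 3),
]

recomputed = {}
for rule, support, confidence, coverage, lift, count in table4:
    # Confidence(A -> B) = Support(A and B) / Support(A), with Support(A)
    # taken from the Coverage column (an assumption, see the text above).
    recomputed[rule] = support / coverage
    print(f"rule {rule}: support/coverage = {recomputed[rule]:.4f}, "
          f"reported confidence = {confidence:.4f}")
```

The recomputed values agree with the reported confidence to about three decimal places, with the small differences attributable to rounding in the table. The Count column also shows that each of the three rules rests on three transactions.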
Table 5: Support, Confidence and Lift of Product Category Bundling Rules Targeting “Rokok”
| Rules | LHS | RHS | Support | Confidence | Coverage | Lift | Count |
| 1 | {Korek gas} | {Rokok} | 0.0078459 | 0.733333 | 0.0106990 | 4.942949 | 11 |
| 2 | {Air mineral, Minuman} | {Rokok} | 0.0035663 | 0.294117 | 0.0121255 | 1.982466 | 5 |
| 3 | {Sandwich} | {Rokok} | 0.0028530 | 0.285714 | 0.0099857 | 1.925824 | 4 |
| 4 | {Biaya admin/transfer perbankan} | {Rokok} | 0.0099857 | 0.218750 | 0.0456490 | 1.474459 | 14 |
| 5 | {Kopi} | {Rokok} | 0.0057061 | 0.205128 | 0.0278174 | 1.382643 | 8 |
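Since all five rules in Table 5 share the same right-hand side, the lift formula can be rearranged to Support(B) = Confidence(A→B) / Lift(A→B), and every row should then imply the same support for “Rokok”. The sketch below performs that check on the reported values; the variable names are illustrative.

```python
# LHS -> (confidence, lift) as reported in Table 5; the RHS is always {Rokok}.
table5 = {
    "Korek gas": (0.733333, 4.942949),
    "Air mineral, Minuman": (0.294117, 1.982466),
    "Sandwich": (0.285714, 1.925824),
    "Biaya admin/transfer perbankan": (0.218750, 1.474459),
    "Kopi": (0.205128, 1.382643),
}

# Rearranged lift formula: Support(Rokok) = Confidence / Lift.
implied_support = {lhs: conf / lift for lhs, (conf, lift) in table5.items()}
spread = max(implied_support.values()) - min(implied_support.values())

for lhs, value in implied_support.items():
    print(f"{lhs}: implied Support(Rokok) = {value:.4f}")
```

Every row implies a support of about 0.148 for “Rokok”, which suggests the five rules are internally consistent and that the category appears in roughly 15% of the analysed transactions.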

Figure 5: Visualization of Product Correlations for Three-Product Bundling

Figure 6: Visualization of the Five Bundling Rules Associated with “Rokok”
Based on the Market Basket Analysis, we identified five rules describing the products that customers typically buy in the same transaction as “Rokok”. The results are shown in Table 5 and Figure 6.
The analysis shows that B-70 Mart's transaction data can be used to evaluate which product bundles to suggest in order to increase sales. Seventy combinations can be recommended for two-item bundles, and three combinations can be recommended for three-item bundles.
Recommendation
Building on these findings, we propose the following recommendations for B-70 Mart to enhance sales through data-driven product bundling and strategic product placement:
Implement Data-Driven Bundling: B-70 Mart should implement the 70 identified two-item bundles and the three identified three-item bundles, as these combinations have demonstrated the potential to increase sales.
Strategic Placement of New Products: The introduction of 'Rokok Juara' presents an opportunity to apply the insights gained from the MBA. In line with the product associations in Table 5, 'Rokok Juara' should be placed close to 'Korek gas', beverages such as 'Air mineral' and 'Minuman', 'Sandwich', 'Biaya admin/transfer perbankan' and 'Kopi'.
Regular Review and Adaptation: Given the dynamic nature of consumer preferences, B-70 Mart should establish a routine for regularly reviewing transaction data to update and refine its bundling strategies.
Staff Training and Customer Education: Employees should be trained to understand the new bundling strategies and to communicate their benefits effectively to customers.
Monitoring and Evaluation: B-70 Mart should implement a robust monitoring system to track the performance of the newly implemented bundling strategies.
References
[1] Agrawal, R. and R. Srikant. “Fast Algorithms for Mining Association Rules.” Proceedings of the 20th International Conference on Very Large Data Bases (VLDB), vol. 1215, no. 12, 1994, pp. 487–499.
[2] Chen, Y.L. et al. “Market Basket Analysis in a Multiple Store Environment.” Decision Support Systems, vol. 40, no. 2, 2005, pp. 339–354.
[3] Hastie, T. et al. The Elements of Statistical Learning: Data Mining, Inference and Prediction. Springer Series in Statistics, 2009.
[4] Russell, S.J. and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 2010.
[5] Kumar, V. et al. “Future of Retailer Profitability: An Organizing Framework.” Journal of Retailing, vol. 88, no. 1, 2012, pp. 1–18.
[6] Smith, A. and J.N.D. Gupta. “Neural Networks in Business: Techniques and Applications for the Operations Researcher.” Computers and Operations Research, vol. 27, no. 11–12, 2000, pp. 1023–1044.
[7] Hahsler, M., B. Grun and K. Hornik. “arules – A Computational Environment for Mining Association Rules and Frequent Item Sets.” Journal of Statistical Software, vol. 14, no. 15, 2005.