The Information Age: Data Mining

Chapter 1 Introduction

1.1 Background

In the information age, large volumes of data are generated everywhere. With the advent of IT tools, this data is collected and waits to be converted into information and knowledge. The information industry in turn provides useful information in many areas, such as market analysis, science, decision making, and customer relations.

Data mining is the integration of analytical techniques with database systems. Previously, database systems offered only queries, data processing, and transaction processing, which are not enough to let users understand all the data at once. These facilities cannot answer complex questions such as what relationships exist between items in the database, yet the answers to such questions are the more valuable ones. Because of the huge amount of data, user needs far exceed the capabilities of the database management system, so it is necessary to uncover hidden patterns and knowledge. Human capabilities are limited, and people cannot understand a very large data set on their own. Powerful tools have therefore been invented to help people analyze large data sets. Without such tools, huge amounts of data would be little better than garbage, because no one would want to investigate them. The process of discovering hidden patterns or useful information in huge amounts of data is called “data mining”.

In a database, associations arise when many elements appear together. The relationships between those elements can reveal interesting results. For example, items purchased together may reflect customer behavior, and patients who have flu and fever are likely to have a cough as well. Associations therefore come not only from products in a store but also from events in other situations. I focus only on the product side: how market basket analysis is implemented in retail store databases. Data mining is also a broad area.
It is the process of extracting useful information, such as correlations and patterns, from a huge data set. The results can answer business questions that would otherwise take a long time to answer. I discuss only the algorithms related to market basket analysis, in particular the Apriori algorithm. There are also many analytical tools for data mining; this research covers only the Weka software as a tool for analyzing the sales data of retail outlets. I discuss how to use Weka with sales data to find information useful to the business, and how to interpret Weka results for business purposes.
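
The idea of items purchased together can be made concrete with a small sketch. The Python example below is only an illustration under stated assumptions: the transactions are invented toy data, the function names pair_support and confidence are hypothetical, and the code simply counts item pairs. It is not the Apriori algorithm and not Weka's implementation. It uses the two standard measures for association rules: support (the fraction of baskets that contain a set of items) and confidence (the fraction of baskets containing one item that also contain another).

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]


def pair_support(baskets):
    """Return, for each item pair, the fraction of baskets containing both items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair has a single canonical (alphabetical) order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: count / total for pair, count in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Fraction of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)


support = pair_support(transactions)
print(support[("bread", "milk")])                  # support of {bread, milk}
print(confidence(transactions, "bread", "milk"))   # confidence of bread -> milk
```

In this toy data, bread and milk appear together in three of the five baskets, and three of the four baskets that contain bread also contain milk. A retailer could read such a result as a hint that the two products are worth placing or promoting together.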