TABLE 4.5 Database for Exercise 3
TID Items
T01 Cheese, Milk, Egg
T02 Apple, Cheese
T03 Apple, Bread, Cheese, Orange, Grape
T04 Bread, Egg, Orange
T05 Cheese, Milk, Grape
T06 Apple, Cheese, Egg, Orange
T07 Bread, Cheese, Orange
T08 Cheese, Egg, Grape
T09 Bread, Cheese, Egg, Grape
T10 Bread, Cheese, Grape
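As a starting point for working with Table 4.5, the transactions can be encoded as sets and the support count of each single item tallied, which corresponds to the first pass of Apriori. The following is only a minimal sketch in Python; the names TRANSACTIONS and item_support are illustrative assumptions, not taken from the text.

```python
from collections import Counter

# Table 4.5 encoded as a mapping from transaction ID to its set of items.
TRANSACTIONS = {
    "T01": {"Cheese", "Milk", "Egg"},
    "T02": {"Apple", "Cheese"},
    "T03": {"Apple", "Bread", "Cheese", "Orange", "Grape"},
    "T04": {"Bread", "Egg", "Orange"},
    "T05": {"Cheese", "Milk", "Grape"},
    "T06": {"Apple", "Cheese", "Egg", "Orange"},
    "T07": {"Bread", "Cheese", "Orange"},
    "T08": {"Cheese", "Egg", "Grape"},
    "T09": {"Bread", "Cheese", "Egg", "Grape"},
    "T10": {"Bread", "Cheese", "Grape"},
}


def item_support(transactions):
    """Count, for every item, the number of transactions that contain it."""
    counts = Counter()
    for items in transactions.values():
        counts.update(items)
    return counts


if __name__ == "__main__":
    support = item_support(TRANSACTIONS)
    for item, count in sorted(support.items()):
        print(f"{item}: {count}/{len(TRANSACTIONS)}")
```

Dividing each count by the number of transactions gives the relative support that is compared against a minimum support threshold.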
© 2009 by Taylor & Francis Group, LLC
TABLE 4.6 Sequence Database for Exercise 10
SID Transaction Sequences
S01 (bc)(d)(ab)(def)
S02 (abc)(cf)(df)
S03 (cef)(df)(ab)(f)
S04 (be)(ac)(cdf)
10. Given the sequence database shown in Table 4.6, find the frequent sequential patterns using AprioriAll with minsup = 0.5.
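The support test that underlies this exercise can be sketched in Python. The code below is not a full AprioriAll implementation; it only checks whether a candidate pattern is contained in each sequence of Table 4.6 and reports the fraction of sequences that contain it. The helper names parse, contains, and support are hypothetical, and items are assumed to be single letters.

```python
from typing import FrozenSet, List

Itemset = FrozenSet[str]


def parse(text: str) -> List[Itemset]:
    """Turn a string such as "(bc)(d)" into [frozenset("bc"), frozenset("d")]."""
    return [frozenset(part) for part in text.strip("()").split(")(")]


# Table 4.6, with the extraction spacing inside the itemsets removed.
DATABASE = {
    "S01": parse("(bc)(d)(ab)(def)"),
    "S02": parse("(abc)(cf)(df)"),
    "S03": parse("(cef)(df)(ab)(f)"),
    "S04": parse("(be)(ac)(cdf)"),
}


def contains(sequence: List[Itemset], pattern: List[Itemset]) -> bool:
    """True if each pattern element is a subset of a distinct sequence element,
    with the matched elements appearing in the same order as in the pattern."""
    pos = 0
    for element in pattern:
        # Greedily advance to the earliest itemset that covers this element.
        while pos < len(sequence) and not element <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next pattern element must match a strictly later itemset
    return True


def support(pattern: List[Itemset]) -> float:
    """Fraction of sequences in DATABASE that contain the pattern."""
    hits = sum(contains(seq, pattern) for seq in DATABASE.values())
    return hits / len(DATABASE)


if __name__ == "__main__":
    for candidate in ["(c)(d)", "(ab)", "(b)(a)", "(a)(b)"]:
        print(candidate, support(parse(candidate)))
```

With minsup = 0.5 and four sequences, a pattern is frequent when at least two sequences contain it. AprioriAll applies this kind of test level by level to candidates generated from shorter frequent patterns.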