Wednesday, 15 August 2012

algorithm - Data mining: Apriori issue. Min-support -




I wrote a data mining Apriori algorithm. It works on a small test data set, but I am having issues running it on bigger data sets.

I am trying to generate rules for items that are frequently bought together.

My small test data has 5 transactions and 10 products.

My big test data has 11 million transactions and around 2700 products.

Problem: min-support and filtering out non-frequent items. Let's imagine we are interested in items with a frequency of 60% or more. frequency = 0.60;

When I compute min-support for the small data set with a 60% frequency, the algorithm removes all items that were bought fewer than 3 times. min-support = numberoftransactions * frequency;
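A minimal sketch of the filtering step described above, written in Python (the question does not say which language was used, so that is an assumption). The sample transactions and the name `frequent_items` are hypothetical; only the formula for min-support comes from the question.

```python
from collections import Counter


def frequent_items(transactions, frequency):
    """Return (items meeting min-support, min-support) for single items."""
    # Same formula as in the question: numberoftransactions * frequency.
    min_support = len(transactions) * frequency
    # Count each item at most once per transaction.
    counts = Counter(item for t in transactions for item in set(t))
    kept = {item: c for item, c in counts.items() if c >= min_support}
    return kept, min_support


# Hypothetical toy data: 5 transactions, as in the small test set.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "eggs"},
]

frequent, min_support = frequent_items(transactions, 0.60)
print(min_support, frequent)
```

With 5 transactions and a frequency of 0.60, the threshold works out to 3, so every item bought fewer than 3 times is dropped.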

But when I try the same thing on the big data set, the algorithm filters out almost the whole item set after the first iteration; only a couple of items are able to meet such a threshold.

So I started decreasing the threshold lower and lower, running the algorithm many times. Not even 5% gave the desired results. I had to lower the frequency all the way to 0.0005 to get at least 50% of the items involved in the first iteration.

What do you think: might the current situation be a data problem, since the data was generated artificially (Microsoft Adventure Works version)? Or is it a problem with my code or with the min-support computation?

Maybe you can offer another solution or a better way of doing this?

thanks!

Maybe that is just what your data is like.

If you have a lot of different items, and only a few items per transaction, the chances of items co-occurring are low.
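A rough back-of-the-envelope check of this point, as a Python sketch. The 2700 products come from the question; the average basket size of 3 is an assumed value, since the question does not give one, and the pair estimate assumes items are bought independently and uniformly, which real data will not match exactly.

```python
def mean_item_support(num_products, avg_basket_size):
    """Mean relative support over all products.

    Each transaction adds one count to every distinct item it contains,
    so the relative supports of all products sum to the average basket size.
    """
    return avg_basket_size / num_products


# 2700 products is from the question; a basket size of 3 is an assumption.
mean_support = mean_item_support(2700, 3)

# Very rough estimate for a pair of items, assuming independence.
pair_support = mean_support ** 2

print(f"mean item support: {mean_support:.6f}")
print(f"rough pair support: {pair_support:.8f}")
```

Under these assumptions the average item appears in roughly 0.1% of transactions, which is the same order of magnitude as the 0.0005 threshold the question ended up with and far below 60%.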

Did you verify the result? Is it pruning incorrectly, or is the algorithm correct and your parameters are bad?

Can you name an itemset that Apriori pruned but shouldn't have pruned?
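One way to run such a check is to count the support of a suspect itemset by brute force and compare it with the threshold. The following Python sketch assumes the transactions fit in memory as sets; `support_count`, `wrongly_pruned` and the sample data are hypothetical names and values, not part of the original code.

```python
def support_count(transactions, itemset):
    """Count transactions that contain every item of itemset."""
    wanted = frozenset(itemset)
    return sum(1 for t in transactions if wanted <= set(t))


def wrongly_pruned(transactions, itemset, min_support):
    """True if the itemset meets min_support, so pruning it was an error."""
    return support_count(transactions, itemset) >= min_support


# Hypothetical toy data and threshold.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "eggs"},
]
min_support = 3

print(wrongly_pruned(transactions, {"bread"}, min_support))
print(wrongly_pruned(transactions, {"bread", "milk"}, min_support))
```

If an itemset that the implementation pruned makes this function return True, the bug is in the code; if not, the pruning was correct and the parameters are the issue.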

The problem is, yes, choosing the parameters is hard. And no, Apriori cannot use an adaptive threshold, because that wouldn't satisfy the monotonicity requirement. You must use the same threshold for all itemset sizes.
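To illustrate why, here is a small Python sketch on hypothetical toy data. It uses a deliberately adaptive threshold (3 for single items, 2 for pairs; both values are made up for the example) and compares Apriori-style candidate generation against a brute-force count of all pairs.

```python
from itertools import combinations


def support_count(transactions, itemset):
    """Count transactions that contain every item of itemset."""
    wanted = frozenset(itemset)
    return sum(1 for t in transactions if wanted <= set(t))


# Hypothetical toy data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "eggs"},
]
items = sorted(set().union(*transactions))

# Hypothetical adaptive thresholds per itemset size (not valid for Apriori).
thresholds = {1: 3, 2: 2}

# Apriori step 1: keep only single items that meet the size-1 threshold.
frequent_1 = {i for i in items if support_count(transactions, {i}) >= thresholds[1]}

# Apriori step 2: pairs are built only from the surviving single items.
apriori_pairs = {
    frozenset(p)
    for p in combinations(sorted(frequent_1), 2)
    if support_count(transactions, p) >= thresholds[2]
}

# Brute force: test every possible pair against the size-2 threshold.
all_pairs = {
    frozenset(p)
    for p in combinations(items, 2)
    if support_count(transactions, p) >= thresholds[2]
}

missed = all_pairs - apriori_pairs
print(missed)
```

Here the pair of bread and butter meets the lower size-2 threshold, but butter was already pruned at size 1, so the pair is never generated. With a single threshold this cannot happen, because the support of an itemset is never larger than the support of any of its subsets.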

