  1. #1

    [Scientific] Advancing Sales Detection

    Hi,

    Now since I study computer science I thought that one could take a more scientific approach to sales detection.

    I've learned a lot about neural networks (http://en.wikipedia.org/wiki/Neural_network), and one of their main purposes is to classify things. For example, you can classify plants by the values of a few key features, such as the width and length of their leaves.

    Now I thought one could train a neural network to detect sales if you feed it values for: how many cheaper auctions were available at the last scan, what is the cheapest price at the last and current scan, was there a bid on the item, how much time was left etc.
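A minimal sketch of how the inputs listed above could be turned into a feature vector for one auction. The `Auction` fields, the time-left bucket names, and the `features` helper are all hypothetical; they are not taken from any addon or from the armory data format.

```python
from dataclasses import dataclass
from typing import List

# Ordinal encoding of the time-left bucket. The bucket names are assumptions.
TIME_LEFT = {"SHORT": 0.0, "MEDIUM": 1.0, "LONG": 2.0, "VERY_LONG": 3.0}


@dataclass(frozen=True)
class Auction:
    auction_id: int
    buyout: int      # total buyout in copper
    stack_size: int
    has_bid: bool
    time_left: str   # one of the TIME_LEFT keys


def unit_price(a: Auction) -> float:
    return a.buyout / a.stack_size


def features(target: Auction, last_scan: List[Auction],
             current_scan: List[Auction]) -> List[float]:
    """Input vector for one auction of a single item.

    last_scan must contain the target itself, so it is never empty.
    """
    price = unit_price(target)
    cheaper_last = sum(1 for a in last_scan if unit_price(a) < price)
    cheapest_last = min(unit_price(a) for a in last_scan)
    # If nothing is listed any more, 0.0 serves as a "no price" marker.
    cheapest_now = min((unit_price(a) for a in current_scan), default=0.0)
    return [
        float(cheaper_last),
        cheapest_last,
        cheapest_now,
        1.0 if target.has_bid else 0.0,
        TIME_LEFT[target.time_left],
    ]
```

Each vector would later be paired with the known outcome (expired, cancelled, sold) to form one training example.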

    So the first step would be: identify key values which might play a role in the decision and find corresponding aggregation functions.

    The second step would be to create lots of data to train the network: i.e. combination of input values plus expected result (i.e. expired, cancelled, sold)

    From there one would train the NN with those input values, then create a new set of input values the NN has not seen before and check whether its output is accurate.

    After that one could play with various parameters to get better results and finally use the output to decide what to display. One could set a limit: i.e. require 80% confidence to classify as sold.
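The train, verify, and threshold steps could look roughly like this. It is only a sketch: a single logistic neuron stands in for a real neural network, the three outcomes are collapsed to sold / not sold, and the toy data is made up purely to make the example run.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(model, x):
    """Estimated probability that the auction sold."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples, epochs=2000, lr=0.5):
    """Fit one logistic neuron with plain stochastic gradient descent."""
    weights, bias = [0.0] * len(samples[0][0]), 0.0
    for _ in range(epochs):
        for x, y in samples:
            err = predict((weights, bias), x) - y  # log-loss gradient
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def accuracy(model, samples):
    """Share of samples whose label matches a plain 50% cut-off."""
    hits = sum(1 for x, y in samples if (predict(model, x) >= 0.5) == (y == 1))
    return hits / len(samples)


def classify(p_sold, threshold=0.8):
    """Only commit to an answer when the model is confident enough."""
    if p_sold >= threshold:
        return "sold"
    if p_sold <= 1.0 - threshold:
        return "not sold"
    return "uncertain"


# Made-up toy data. Single feature: cheaper auctions at the last scan / 10.
# Label 1 = sold, 0 = not sold.
TRAIN = [([0.0], 1), ([0.1], 1), ([0.5], 0), ([0.8], 0), ([1.0], 0)]
HOLDOUT = [([0.0], 1), ([0.9], 0)]  # never shown during training

model = train(TRAIN)
print("hold-out accuracy:", accuracy(model, HOLDOUT))
print(classify(predict(model, [0.0])))
```

The `threshold` parameter is the 80% confidence limit from the post; anything in between is reported as uncertain instead of being guessed.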

  2. #2
    A large enough real-world training dataset would be hard to simulate. This would work best if there was a module in, say, the TSM package that would allow sold auction data to be uploaded to a central db (a la Gatherer), which would be used to train the NN (or an MCMC Bayesian/bootstrap algorithm).
    Retired - I blame Kathroman for everything.

    (that's a joke, eh)

  3. #3
    Xsinthis
    Yeah, this'd require a lot of volunteers to run something that gathers their sold auctions data. Then again, Auctioneer and MySales already record these things, so maybe we already have a large sample?

    Either way I'd be willing to volunteer data for this.

  4. #4
    Timehealzall
    Question/observation: the db for Gatherer is a compilation for those in a guild who use it. It includes all farmed professions. At least I think it does. If so, wouldn't a larger sample be available by gathering the db from as many guilds as possible on any given server? Also... with the new guild system for rewards, isn't this type of data (fish caught, ore, herbs, etc.) compiled by Blizz, and could it be resident on your PC/Mac? If anything, it may be something to look into.
    Let's nerf druid healers some more. NOT!!!

  5. #5
    Well, it wouldn't need too many volunteers - 20 or 30 servers with maybe 5 people covering different markets for maybe a couple weeks. It's just that simulating the training dataset means that the accuracy of the predictions is only as good as the assumptions made in the simulation. So, a real-world training dataset would be best. It's just a matter of grabbing it.

    If it can be included as a module in TSM via the API, it becomes a bit easier to get the data. I think Sapu is already looking at a beancounter type of module, but in any case you're talking about a lot of work on the coding and infrastructure...

    EDIT: The above dataset would be useful for a lot of things, not just this hypothetical project.
    Last edited by Stede; March 4th, 2011 at 03:38 PM.

  6. #6
    Right, I wonder what the dataset mentioned by Timehealzall has to do with sales on the AH.

    Well, I could monitor my own sales, but I play on an EU server, so there is no TUJ data to compare them to. I could still compare them to my own armory-AH scans.

    I'm not sure if MySales tracks all the info needed. You could infer expired and cancelled auctions from the time left, so missing those is not a problem. I just don't know if it saves the auction IDs.
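One way the time-left inference could work, as a rough sketch: if an auction vanished sooner than its time-left bucket allows, it cannot have expired. The bucket boundaries below are assumptions (Short under 30 minutes, Medium 30 minutes to 2 hours, Long 2 to 12 hours, Very Long over 12 hours) and should be checked against the game before relying on them.

```python
# Minimum minutes an auction in each bucket still had left at the last scan.
# These boundaries are assumed, not taken from any official source.
MIN_MINUTES_LEFT = {"SHORT": 0, "MEDIUM": 30, "LONG": 120, "VERY_LONG": 720}


def could_have_expired(time_left_at_last_scan: str,
                       minutes_since_last_scan: float) -> bool:
    """True if enough time has passed for the auction to run out."""
    return minutes_since_last_scan >= MIN_MINUTES_LEFT[time_left_at_last_scan]


def outcome_for_unsold(time_left_at_last_scan: str,
                       minutes_since_last_scan: float) -> str:
    """Label an auction that disappeared without a recorded sale."""
    if could_have_expired(time_left_at_last_scan, minutes_since_last_scan):
        return "expired or cancelled"
    return "cancelled"


print(outcome_for_unsold("LONG", 60))
```

An auction marked Long that is gone one hour later had at least two hours left, so under these assumptions it can only have been cancelled (given that no sale was recorded for it).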

    So yeah after thinking some more about which information might be of value to the decision I'll go check where one could get those from.

  7. #7
    Athkatla
    You've got the right method applied to the wrong problem. Sales detection isn't a complicated problem; 9 times out of 10, you can solve it like this:
    An auction present in Scan1 is not present in Scan2. Where'd it go?
    -If the auction was the cheapest (at that stack size) in Scan1, it probably sold.
    -If all the auctions below this one in cost have also disappeared, it probably sold.
    -Otherwise, it was probably cancelled.
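The three rules above can be sketched as a small function. The `Auction` fields are hypothetical, and comparing "cost" by per-unit price in the second rule is an assumption, since the post does not say whether stack size matters there.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Auction:
    auction_id: int
    stack_size: int
    buyout: int  # total buyout in copper


def unit_price(a: Auction) -> float:
    return a.buyout / a.stack_size


def classify_missing(target: Auction, scan1: List[Auction],
                     scan2: List[Auction]) -> str:
    """Guess what happened to an auction from scan1, given the next scan."""
    ids_now = {a.auction_id for a in scan2}
    if target.auction_id in ids_now:
        return "still listed"

    # Rule 1: it was the cheapest at its stack size in scan1.
    same_stack = [a for a in scan1 if a.stack_size == target.stack_size]
    if all(target.buyout <= a.buyout for a in same_stack):
        return "sold"

    # Rule 2: every cheaper auction (per unit) has disappeared as well.
    cheaper = [a for a in scan1 if unit_price(a) < unit_price(target)]
    if all(a.auction_id not in ids_now for a in cheaper):
        return "sold"

    # Rule 3: otherwise assume the seller pulled it.
    return "cancelled"


a1, a2, a3 = Auction(1, 1, 100), Auction(2, 1, 110), Auction(3, 1, 120)
print(classify_missing(a2, [a1, a2, a3], [a1, a3]))  # a cheaper one remains
```

The answers are only "probably" right, as the post says: the function has no notion of how much time passed between the two scans.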

    The issue is that this data is unreliable when you're talking about a volatile market and infrequent scans. Scan more frequently and the data gets better.

    Now, the more interesting learning problem (that I'm working on) is this: I'm considering posting this auction. What is the probability that someone will buy it before someone else undercuts it? Once it has been undercut, how much does the chance of a sale decrease? Using that information, you can estimate the gross value of the transaction used to post the auction.
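A naive way to put numbers on those questions, purely as an illustration and not the approach the poster is actually working on: count past outcomes, compare the sale rate with and without an undercut, and weight the asking price by the overall chance of a sale. The history records below are invented for the example.

```python
from typing import List, Tuple

# Each record: (was_undercut, sold). Made-up data for illustration only.
History = List[Tuple[bool, bool]]


def sale_rate(history: History, undercut: bool) -> float:
    """Observed share of auctions that sold, split by undercut status."""
    outcomes = [sold for was_undercut, sold in history if was_undercut == undercut]
    if not outcomes:
        raise ValueError("no records for this case")
    return sum(outcomes) / len(outcomes)


def expected_gross(price: int, history: History) -> float:
    """Asking price weighted by the estimated probability of a sale."""
    p_undercut = sum(1 for was_undercut, _ in history if was_undercut) / len(history)
    p_sold = ((1 - p_undercut) * sale_rate(history, False)
              + p_undercut * sale_rate(history, True))
    return price * p_sold


HISTORY: History = (
    [(False, True)] * 3 + [(False, False)] * 1    # not undercut: 3 of 4 sold
    + [(True, True)] * 1 + [(True, False)] * 3    # undercut: 1 of 4 sold
)

drop = sale_rate(HISTORY, False) - sale_rate(HISTORY, True)
print("drop in sale chance once undercut:", drop)
print("expected gross for a 1000c posting:", expected_gross(1000, HISTORY))
```

A real estimate would also have to account for time: how long an auction stays the cheapest, not just whether it was ever undercut.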

  8. #8
    I have thought about exactly those rules you posted but I've come up with a situation or two where they either need to be extended or don't work, i.e. reposting the cheapest item with a higher buyout under certain conditions.

  9. #9
    Originally posted by Amandria:
    I have thought about exactly those rules you posted but I've come up with a situation or two where they either need to be extended or don't work, i.e. reposting the cheapest item with a higher buyout under certain conditions.
    That's one exception.

    Another - stack size. The cheapest auctions are not always the ones that get bought.
    Another - Auction gets bought, but at a price that is undercut between scans. Now we have a sale at a price higher than what's seen in the subsequent scan. By the rules given, a sale is not recorded.
    There are probably more. Erorus and I had a few weeks of back & forth emails over sold auction data back in the Autumn. It was a mess to sort through. Short of a predictive algorithm, I don't see any way to get accurate sold auction data from the data collected by TUJ.
    Last edited by Stede; March 4th, 2011 at 04:03 PM.

  10. #10
    Athkatla
    Yeah, I've considered another rule: if a seller has N auctions in Scan1 and N auctions in Scan2, but none of those auctions share auction IDs, treat the missing ones as cancelled and reposted rather than sold.

    The problem with detecting cancels based on reposts is that someone might cancel some auctions, get up to walk the dog or make tea, then come back and repost them. In the meantime, you've probably scanned several times. How far back in your scan history do you look to match reposts?

    The final problem with that sort of rule is that you'll miss all the sales by someone who has several of an item in stock, but only posts one on the AH at a time. Since I do that frequently myself, I don't really like this rule.
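The seller rule from this post, sketched under the assumption that a scan is just a list of (auction id, seller) pairs for one item. It compares only two adjacent scans, so the look-back question stays open, and the second example shows the one-at-a-time seller being flagged even though the first auction may well have sold.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Scan = List[Tuple[int, str]]  # (auction_id, seller), a hypothetical layout


def ids_by_seller(scan: Scan) -> Dict[str, Set[int]]:
    grouped: Dict[str, Set[int]] = defaultdict(set)
    for auction_id, seller in scan:
        grouped[seller].add(auction_id)
    return grouped


def probable_reposters(scan1: Scan, scan2: Scan) -> Set[str]:
    """Sellers with the same auction count in both scans but no shared ids."""
    before, after = ids_by_seller(scan1), ids_by_seller(scan2)
    return {
        seller
        for seller, old_ids in before.items()
        if seller in after
        and len(old_ids) == len(after[seller])
        and old_ids.isdisjoint(after[seller])
    }


# Seller A swapped both auctions for new ids; seller B is unchanged.
print(probable_reposters([(1, "A"), (2, "A"), (3, "B")],
                         [(10, "A"), (11, "A"), (3, "B")]))

# Seller C posts one at a time: a sale followed by a fresh post looks identical.
print(probable_reposters([(5, "C")], [(6, "C")]))
```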

 

 
