Higgs Hunters Talk

Simulated data?

  • Turkwise by Turkwise

    I'm seeing a whole lot of images that are telling me that they are simulated data. Is this data that has been there from the beginning, but the site is just now telling us what images are or are not simulated? Or is this new?


  • koranzite by koranzite scientist

    Some of the data is simulated, some of it is real data. This is so we can check how good you all are at spotting vertices that have been artificially added. This is why you have been seeing so many potential vertices, as we want to train you to be really good at spotting them so when you stumble upon one in real data (which are quite rare) you will spot it easily.

    It has been this way from the beginning; we only recently decided to make it clear that some were simulated so you'd know when you might have actually made a discovery in the real data 😃


  • Ptd by Ptd

    Hi koranzite

    Sprinkling a few in so we can get our eyes in yes, your needing to understand/calibrate our sensitivity to ocvs also very much yes, but 9 out of 10 images being simulations? I feel that even experienced zooites will very quickly lose interest if that continues.

    Also please could the Science Team say in each case what is being simulated, for example particle of type A breaking down into particles B, C & D of which the detector could actually see only C & D, at location E? I recognize that some of us (for example me) aren't really going to understand what those particles are just from the name, but it would be interesting to know, definitively, what some of what we are looking at is.



  • delete_9875306e by delete_9875306e

    Eh, let me just say it. First, this whole idea was interesting enough to do several hundred shots. After the first two hundred, I even signed on in order to remove the annoying pop-up after every 5 shots, and my eyes learned to ignore the whole interface except the data and a couple of buttons. So I missed one little thing, but before I say it, here is what motivated me to do all the above: "Uncover the building blocks of the universe. Help search for unknown exotic particles in the LHC data."
    Then I saw what I learnt to not see: "Thank you. Your classification has been recorded. This was simulated data that we show volunteers in order to calibrate the project."
    Thank you for luring me into wasting my time on free QA or debugging of some simulator. Not interested.


  • JeanTate by JeanTate in response to koranzite's comment.

    For what it's worth, I think the use of simulated data/simulations needs to be handled very, very carefully.

    That it is important, vital even, for there to be simulations is true; zooites' classifications cannot really be used - to do science - unless they are properly calibrated, and simulations are very good for calibration.

    And other - very successful - Zooniverse projects have used sims, e.g. Andromeda Project, Space Warps.

    However, one of the first uses of sims, perhaps the first*, started just as you described: zooites were presented sims and were not told they were sims immediately after finishing a classification. The outrage was intense.

    So, my suggestion: re-write the introductory material, tutorials, etc to include examples of sims, a detailed explanation of why they are needed, and a clear statement about how often you are likely to come across one. If the majority of images to classify contain sims, you might want to carefully rethink how to explain this project. I believe that collider_spider's sentiment will be typical, with many a zooite being considerably less happy.

    *for oldbie zooites, the mention of "fake AGNs" can still generate a great deal of anger.


  • undefined by undefined

    Yes, I'm very sorry, but I also have to admit that it's very disappointing for me to know I invested many hours into simulated images.
    I understand that simulated images are required to adjust the project, but it's still disappointing.
    And my motivation drops, since I have to assume that many (or even all) of the "nice" and "interesting" images I found/saw are just simulated...


  • Marcuandy1981 by Marcuandy1981

    This is only the training. Do you want to make bad classifications on the real data? Or do you want to learn first how to classify it properly?


  • JeanTate by JeanTate in response to Marcuandy1981's comment.

    If the experience on other Zooniverse projects where sims/calibration/whatever images have been used is a reliable guide, many/most zooites are OK with 'clicking to calibrate' ... provided it is explained, clearly, why such an activity is needed; AND provided they are informed that what they just classified was, in fact, a training/calibration/sim.

    Of course, some zooites are not really into this kind of exercise; which is fine, because there are plenty of Zooniverse projects which do not have such images.

    However, what - historically speaking, per what happened in other Zooniverse projects - really gets zooites mad (in general) is not being told, ahead of time. Worse, being led to believe that the images/objects they are classifying are real (i.e. not sims).


  • undefined by undefined

    One other "problem" I see: if they want to train us with simulated images, it would be nice to get some feedback, such as a solved version of the simulated image we just classified, so we can see whether we missed something, marked something wrong, or did it right.


  • JeanTate by JeanTate in response to undefined's comment.

    This is a 'devil and the deep blue sea' kind of problem (as I understand it).

    On the one hand, the human brain is a superlative pattern-recognition machine, and some brains are exceptional (for a given task, e.g. picking real supernovae among the 'look alikes', or spotting asteroids; being exceptional at one task does not necessarily make you exceptional for a different task); on the other hand, humans have well-known blindnesses, and feedback may 'train' zooites to look for certain rare events, but also train them to ignore others.

    The trade-off between too much specification and feedback, and too little, must surely be hard to decide (at least in a Zooniverse project like this one).


  • DZM by DZM admin

    Hey everyone, the team is aware that way, way too many simulated images are being presented. There's a bug report in, and we're going to work on getting that fixed. I'm fairly sure that we do not want to be presenting nothing but sims!

    Sorry, and promise, we're working on it!


  • de.hawkeye by de.hawkeye

    I understand the need for calibration. On the other hand, it is quite frustrating to not work on real data. Also, I now ask myself: am I doing things right? Unfortunately there is no feedback. It would be nice to know the right answer afterwards. Otherwise it's hard to become better. I guess I will just make the same mistakes over and over again.

    As a side note, I quite liked the stardust@home approach. They had a test object every now and then, and for each bad answer they reduced your overall score.



  • JeanTate by JeanTate


    Will we be able to go back over the images already classified, and see - once the bug has been fixed - which were sims and which not?

    If I recall correctly, in other Zooniverse projects, there were two approaches to this: a) when you finished, you were told if an image was a sim or not, but unless you commented on it, that fact/message never re-appeared (for you); b) once classified, you could review an image - later (e.g. in Favorites) - and you'd know that it was a sim or not.

    How does it work in Higgs Hunters?



    Well-stated thread. I would also submit that a proper video would do more for clarification than diagrams. Floating Forests (which IMO is a far easier project) has one (on YouTube). A volunteer on Asteroid Zoo put one up himself on YouTube, which gave greater understanding.


  • koranzite by koranzite scientist

    Please see this thread for a full explanation of the problem and what we're doing to fix it: http://talk.higgshunters.org/#/boards/BHH0000004/discussions/DHH00001e4


  • tashipoo by tashipoo in response to JeanTate's comment.

    You're right. As long as we know they are sims, no problem (and as long as they don't come 10 in a row). I did a few hundred thousand classifications on Space Warps, and the sims were clearly labeled and infrequent enough not to cause irritation. We were also clearly informed of their purpose. Everything has a learning curve and we are all learning as we go.


  • rlb66 by rlb66

    I am new and I am getting a lot of sims. Can I get the correct answers for the sims, so I can correct myself in future as necessary?