Test the Empire! Need your input!

Discussion in 'Community Discussion' started by Aikar, Feb 6, 2017.

  1. So, JUnit finds its way to the Empire? ;)

    With all due respect: although this system might help to prevent some common bugs, I don't see it helping you in the long run to prevent those nasty bugs you speak of. If you really want to get serious about squashing bugs then you should consider getting a team of testers together who can try things out for themselves and report back their findings.

    For the simple reason that testing things for breakage does not involve simply going over a predefined set of routines and/or commands to see if things still work. First of all the obvious: if you guys add new things or change existing stuff then this may result in things no longer working as us players would expect. Us players and a predefined set of test routines, I might add. But that does not imply a bug or breakage by definition.

    But most of all: the process of testing does not solely revolve around trying commands to see if they still work. It all revolves around bringing things together (or trying to) and trying out several combinations of things. You try something, you see the results, and then you use what you just saw to try out other combinations.

    The most important thing for a beta tester is the ability to work outside predefined boundaries which allows you to do things which others didn't anticipate up front. And that's something you simply can never automate, because the variables for this will constantly change with every new feature or change you make to the system.

    I know I brought this topic up a few times in private before, to which I was told that you guys already have such a test team consisting of staff and (optionally) the build team. Yet as also mentioned there I still think that logic is flawed by definition.

    For the simple reason that the first thing you want from a tester is to have as few ties with your project as possible. Because the more involved someone is with a project, the higher the risk that they will overlook the obvious things. It's not just dry theory, it's a given. I'm talking with 10+ years worth of experience within this field :p

    But I'm also basing this on several of the things I've found and reported to you guys already (you know what I'm talking about). Most of those issues did not revolve around one command which stopped working, but around combining things.

    Example: when I write up documentation for a customer project then I always let my girlfriend proofread it. For the simple reason that I'm so caught up in the project that it's a given that I will overlook obvious typos and/or misuse of words. For the simple reason that I usually work for days on such a project and in the end I can almost recite the thing from memory. As in: I'll know exactly what it should say and where it should say it, which also means that I'll very easily overlook silly mistakes because I know what it should read.

    The classic use of double double (<== intended) words or repetition of those, etc.

    And it's not just within this field. The same applies to software development, redstone circuitry, audio design, compositions, and even Minecraft map making.

    If you want to get serious about tests then I honestly think you should give this some consideration. Get players involved who you can trust to keep quiet about new features, and allow them to test these things. But without involving them in the development process itself.

    As always just my 2 cents :)
  2. Unlikely that unit testing would be new to EMC. Unit tests are simplistic and often based on vital functionality.
    The point of this is to avoid the hassles of a testing team, but still see if they can add essentials to their testing. This isn't about squashing bugs, it is about making sure no bugs are deployed (something that doesn't work during development is not a bug, it's just broken).
    You mentioned unit testing earlier. The point of unit testing is to create a suite of tests that should always pass in the foreseeable future. If Mojang changes a feature, and a test fails, then the test is adjusted. But if EMC implements something, and one of these tests fails, that says "hey, you broke something that you likely didn't intend to break!" So yes, you do want simplistic tests that cover trivial things. It is often the trivial things that cause the most harm. For instance, things like the recent netherhound bug would have been caught by unit testing had tests been created.
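    To make the "suite of tests that should always pass" idea concrete, here is a minimal sketch in Python. It is not EMC's code: the `fall_damage` rule, the check names, and the tiny registry are all invented for illustration, and the damage formula is a simplified assumption.

```python
from typing import Callable, Dict, List

# Hypothetical registry of named checks (illustrative, not EMC's test suite).
CHECKS: Dict[str, Callable[[], bool]] = {}


def check(name: str):
    """Decorator that registers a zero-argument check under a readable name."""
    def register(fn: Callable[[], bool]) -> Callable[[], bool]:
        CHECKS[name] = fn
        return fn
    return register


def fall_damage(height: int, in_town: bool) -> int:
    """Assumed rule for the example: no fall damage in town,
    otherwise one point per block fallen beyond the first three."""
    if in_town:
        return 0
    return max(0, height - 3)


@check("no fall damage in town")
def check_town_is_safe() -> bool:
    return fall_damage(20, in_town=True) == 0


@check("fall damage applies outside town")
def check_wild_is_dangerous() -> bool:
    return fall_damage(10, in_town=False) == 7


@check("short falls are harmless")
def check_short_fall() -> bool:
    return fall_damage(3, in_town=False) == 0


def run_suite() -> List[str]:
    """Run every registered check and return the names of those that fail."""
    failures = []
    for name, fn in CHECKS.items():
        try:
            passed = fn()
        except Exception:
            passed = False  # a crashing check counts as a failure
        if not passed:
            failures.append(name)
    return failures
```

    If an unrelated change accidentally altered `fall_damage`, `run_suite()` would return the names of the broken checks instead of an empty list, which is the "you broke something you didn't intend to break" signal described above.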
    This isn't true. It is fairly simple to randomize command and environment usage. It can be resource intensive and time consuming (for the script), but it is possible and it is used extensively in industry.

    Beta testers aren't typically reliable. A beta tester will try to break things, try to show things work, or attempt to validate that the product works as described. But they aren't good at it in small numbers. And large numbers are too unrealistic for EMC. The current team that tests is far better than a beta team, solely due to the fact that they have access to previous bug reports and know the details that regular members cannot.

    Another thing to mention is that bugs that repeat themselves are the ones to worry about. If you strung together three unrelated commands and your game crashed, that doesn't matter. It is an unrealistic scenario and a waste of time and resources. A product should meet the set out requirements. Requirements usually take the form of "a user should be able to do blah blah blah blah detail detail detail", not "a user should not be able to find a bug."
    Testing should consist of an internal knowledge team (the developers) and also those who don't have internal knowledge. Being on the staff team doesn't mean they have internal knowledge. The build team especially. If you don't know how the code works, then you are a sufficient tester. Yes, those not on each team would also be good, but not necessarily better.
    Again, this is easily automated through randomization.
    Aikar is an expert in his field; the idea that a form of proofreading would help is not exactly logical. No one is perfect, but if you are very good at what you do, it is hard to find someone unconnected with what you are doing who can actually aid you in a form of proofreading. Your example of grammar doesn't apply in a situation such as this; you and your girlfriend may be in different professions but your ability to use language is likely similar.
    It should be obvious that he is serious about testing. He is essentially data mining. He knows exactly what he thinks the game should do, but he realizes he may not have the same view as everyone else. He is keeping the actual process closed but is still generating ideas. This is exactly what most companies (especially app makers) need.

    A dedicated, non-staff testing team would be helpful in the situation where there are rapid deployments and not enough time to let the Dev team test. That team would be tasked with finding critical bugs. But as it is, Aikar is looking to make the testing phase easier by introducing more automated tests for things not often thought about. This will let him focus more on development, rather than have to worry about netherhound issues when he is working on something that shouldn't have an effect on them.
  3. In no particular order -
    • Villagers going *poof*.
      • This may actually be an existing bug in the wild, but all I have for evidence is the villagers that go *poof* at Fe(II). No one can kill or egg them (they're enclosed in protected blocks), so I'm at a loss why 2-12 will sometimes vanish.
    • Vanishing "domesticated" animals, especially in the wild. Baaah. We never wanna breed sheep again.
    • The opposite problem - The frontier suddenly being flooded with Love Potion #9 and the animals breed (or magically spawn) themselves at insanely fast rates. Though thanks for the free steaks and chops!
    • The TNT safeties.
    • The fire safeties.
    • PvP damage outside the PvP arena, including while on transport. Like setting me and my horse on fire with an arrow (--coughs-- Ulti <3).
    • Antigrief features, and flist/vouching features.
    • Death by Momo glitches. I'd like never to be frozen again please in the ground and unable to even select a tool/weapon while he stomps all over me. It's been really nice not having those for 2 years :)
    • Don't break the redstone circuitry or the pistons for the love of the gods please.
    I have to be up in 6 hours for work, so I have to leave off for the moment. I wish you and the devs the best on this, thanks!
  4. Please note the smiley in there :)

    I think you didn't read my comment thoroughly.

    Sure it's easy to randomize commands but that's generally speaking a waste of resources, as you mentioned yourself. Which is where a beta tester can be an invaluable asset simply because he can cut through the overhead and pinpoint specific targets. Beta testing is much more than merely entering in random commands, as you seem to think it is :)

    Untrue. If that was the case then people who are good at it wouldn't manage to get paid so well. Quantity doesn't make quality, as always.

    That line of thinking is what's getting us all those remote and local exploits which you see popping up more and more recently. Just saying, because yah: look up the recent issues with the OpenBSD httpd daemon, I think you might spot a parallel.

    I appreciate your comment but I doubt that you know the context which I was hinting at. Unless of course you're an alt who did see all that but yah, that's something I obviously can't say for sure :)

    Untrue. It's not about knowing how the code works, it's about being familiar with the end product which can make or break a test procedure. Of course within the context of beta testers.

    I seriously think you didn't bother to read my comment thoroughly, so I'm leaving it here.

    Please note the 'Example' remark I made. I wasn't suggesting that Aikar should proofread; I gave an example of why being (very) familiar with an end product can make or break a test procedure, but within the field of writing documentation. I used this example because I assumed it would be easiest for people to understand.
  5. Not using JUnit, as JUnit is compile-time testing, and this is runtime testing.

    I'm not sure I even want to explore the idea of loading JUnit at runtime, as then it'll be shipped with the plugin jar, which is very uncommon to do.

    Current test suite stuff looks like: https://gist.github.com/aikar/f10f71143c45ddc13efb79d6ffac5f23

    Also, please focus on EMC specific behavior, not vanilla items.

    It's way too difficult to try to test vanilla items. Not even Mojang tests their own code! I surely am not going to start for them lol.
  6. Beta testing is just people using a product. Not really much more.



    Beta testing is a far worse waste of resources. A simulated environment that randomizes commands can power through thousands of scenarios a night, while beta testers will move at a snail's pace. The only benefit to a beta tester is that they provide their own environment, but given that EMC is a server, this really doesn't help much. Even a general work laptop can power through commands, so it might not be necessary to even use the servers for randomized testing.

    Just because it is randomized testing doesn't mean it is entirely random. If Aikar logged the use of commands for a day and looked at their frequency, he would have more than enough information to set up an automated environment that could outclass human testers (at least command-wise). The main time cost would be in actually creating this system.
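    As a rough sketch of what frequency-weighted randomization could look like (written in Python for brevity; the command names `/alpha`, `/beta` and `/gamma` are placeholders, and nothing here reflects how EMC actually logs or replays commands): count how often each command appears in a log, then draw random sessions with the same mix.

```python
import random
from collections import Counter
from typing import Iterable, List, Optional


def build_weights(command_log: Iterable[str]) -> Counter:
    """Count how often each command appears in a log of issued commands."""
    return Counter(command_log)


def random_session(weights: Counter, length: int,
                   seed: Optional[int] = None) -> List[str]:
    """Draw a command sequence whose mix follows the logged frequencies.

    A fixed seed makes the sequence reproducible, so a session that
    triggers a failure can be replayed exactly.
    """
    rng = random.Random(seed)
    commands = list(weights.keys())
    counts = [weights[c] for c in commands]
    return rng.choices(commands, weights=counts, k=length)


# Placeholder log: "/alpha" was used three times as often as the others.
log = ["/alpha", "/alpha", "/alpha", "/beta", "/gamma"]
session = random_session(build_weights(log), length=20, seed=42)
```

    Each generated session would then be fed to a test server, with the seed recorded alongside any failure so the same sequence can be replayed.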

    Regardless, my point was that we don't need beta testers because nearly all critical bugs are found before releases as is, and the teams that have access to Stage already provide enough data. It could even be argued that the SMPs are beta servers as is.

    Adding in a dedicated beta team would result in more pauses to the development process. It would have to be consistently used to be useful, and Aikar is anything but consistent (as in, he has varying free time to work on development that can't be scheduled accurately (not a bad thing)).

    Beta testers aren't (generally) paid. Testers are paid. There is a big difference, and that difference is entirely knowledge based.

    If you are talking about bug bounty hunters, that isn't comparable to a beta tester. Again, knowledge is key to their worth.
    Not really a critical issue, unless those systems are used in a life critical system. They are not, as regulations restrict what can be used in such critical systems, and open source software is not exactly well known for that.

    Now, the OpenSSL bug from a few years ago would be considered a critical bug. Even then, it wouldn't have been reasonable to dedicate the resources to finding the bug itself.

    I am not denying the existence of annoying bugs. But annoying bugs are not usually an issue that halts production. The in-flight entertainment system of an airplane doesn't have to work for the software system to be deployed, because it doesn't critically affect the operation of the plane. Engine controls, on the other hand, do because they are relatively important.
    I am not sure why the context matters. The principle is still the same.
    Beta testers aren't blindly given a product. There is an expectation that they are given the basic knowledge needed to start toying with the product. The idea is that they can use it without interference.

    If a beta tester would know any internal information about a product (other than basics), then they aren't a beta tester. They are now regular testers.
    I understood what you were saying, hence why I said a "form of proofreading." I also understand why you used the example you did, but it is not applicable in this situation because it contradicts your point. You and whoever is proofreading for you are, in general, at the same level. Beta testers, by definition, are not.

    You are correct in saying that Aikar's familiarity with the end product hinders his ability to see all the faults with it. But you are wrong in assuming that the staff and build teams know enough to also be hindered by this. They do know enough to be blind to some things, but as we have seen, these things are rarely critical issues.
  7. Items on the ground not grouping together (this happened after 1.9, and currently is still broken in that only items that came from breaking blocks group together, not items players threw)
  8. Any particular items? First thing I tried grouped fine.
  9. Ok..... well... a few things you might want to test against based on previous major updates:
    • The mob exp differences between EMC and vanilla
      When 1.8 hit all of the Guardians falling to their doom dropped exp without being touched by the player. Also the various changes that EMC has implemented with afk exp farming (dogs, pigmen).
    • Biome Changes when version updating
      How many wither, witch, and slime farms were lost between 1.7 and 1.8
    • Renamed Monsters
      I can see this easily being broken with various upcoming updates. Currently, as far as I'm aware, each player is only allowed a certain number of renamed monsters, allowing them to stay present without disappearing.
    • Villagers and mobs in town vanishing mysteriously
      This was mentioned before, but is definitely worth mentioning again.
    • How EMC works alongside the various mods allowed
      This is harder to test, but definitely worth looking into. There is one known bug regarding this that is currently in play. I know EMC has turned off various features of other allowed mods because of the advantage they give (mob radar on voxel, printer mode on schematica, etc). I highly doubt your code can test each mod with this, and Shell's idea of a "beta" team might be viable; however, this is something to always be on the watch for.
    • Villager mechanics
      Villager mechanics, specifically how they interact with an active village, are very touch and go on EMC. They have been played with very heavily vs vanilla... with good reason. However, one update... could easily break this.
    • Difficulty level
      Depending on the code, any changes to mob AI could drastically affect anything tied to difficulty level. This refers to exp dropped, special mob aggro levels, damage taken/given, and anything else that difficulty specifically changes.
  10. I recall previous shenanigans with mob and player heads, their skins, and stacking.

    Spawn rules for Momentus and Marlix, and the effective level stuff that governs their drops, which was quite error-prone at one point.
  11. When you're in an area for some time, mobs stop spawning even when not AFK! Requires logging out for a few minutes for the chunk to reset.
  12. It's more if the blocks are, say, broken and then you drop the same item to group. That's what I find.
  13. I asked Aikar about this when we updated. It's not a bug and is an intended feature. Something about making it so others can't pick up your blocks if in a group or something.
  14. Yes, the item grouping is intended, but I did make some recent changes to it. Not sure if I deployed them yet.

    I changed it so broken blocks can still merge into owned items.
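    Behaviour like this is the kind of thing an automated check can pin down. The sketch below is only one reading of the rule described in this thread, written in Python: the `GroundItem` type, the player names, and especially the treatment of two stacks with the same owner are assumptions, not the actual EMC implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GroundItem:
    item_type: str
    # None means the item came from a broken block (assumed to be unowned).
    owner: Optional[str] = None


def can_merge(target: GroundItem, incoming: GroundItem) -> bool:
    """Assumed rule: may `incoming` be absorbed into the `target` stack?"""
    if target.item_type != incoming.item_type:
        return False
    if incoming.owner is None:
        # Unowned items may join any stack of the same type, owned or not.
        return True
    # Guess: an owned item only joins a stack held by the same owner,
    # so another player's items never absorb yours.
    return target.owner == incoming.owner


# Hypothetical players "alice" and "bob".
owned = GroundItem("stone", owner="alice")
broken = GroundItem("stone")
assert can_merge(owned, broken)                          # broken block into owned stack
assert not can_merge(owned, GroundItem("stone", "bob"))  # different owners stay apart
```

    Whichever rule is actually intended, writing it down as assertions like these means a later change that alters item grouping shows up as a failed check rather than as a surprise for players.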
  15. Try naming them/putting something in their helmet slot.
  16. Entity limit might be killing them off. Happened at my iron farm before I fixed it.
  17. INB4 fall damage on in town again...
  18. If un-following a player made their follower count go down by 2, that would be pretty bad. ;)
  19. we don't have followers in game :p
  20. Actually, it's quite applicable and a good example. Whether the two proofreaders have the same level of grammar and language comprehension is really irrelevant, as long as they are both competent. The point is that sometimes when proofreading, a completely competent person can read the same thing multiple times and not catch an error. Sometimes the same person, just by taking a break and returning later will catch mistakes they didn't before. Likewise, having multiple competent people proofread can be just as effective, if not more so. Even people of similar levels of competency can have very different experiential backgrounds. Even your diet, exercise, and sleep habits, or your current mood or goals can affect how you perform at such a task.

    The same principles apply to bug testing. As long as they are competent, having people with diverse perspectives and backgrounds is ideal. Human behaviour can still be hard to predict with algorithms. That's one of the reasons why most software companies make use of outside testers.