What is Snapshot Validator / Snapshot Health Questions


  • What is Snapshot Validator / Snapshot Health Questions

    Hi,

    I'm kicking the tires on Rx Pro 11.2 (2019-07-11, Build 2704700334) on W10 x64. This is a clean install of the OS and RX with no other software installed, and I have a few questions.
    I've not used any 11.x version thus far; I have been using 10.x and Server 2.x.


    Q1: Does running the Snapshot Validator do something more than if I were to just run a defrag?
    For example - does defrag do (defrag + validate)? or are they completely separate functions?

    I ask this because in the past I was told to run a defrag to fix some issues. So is validate a subset of defrag, or is there more to it?


    Q2: In prior RX builds (in the 10.x days, Pro and Server 2.x versions as well) I saw a LOT of RED snapshots, which I was told meant they were corrupted in some way and should not be trusted to roll back to. This forced me to make several manual attempts to re-make the intended snapshot. In some cases it was so faulty that the only way to make a "user" snapshot was to force a clean reboot so a scheduled snapshot could be made with no files being locked (which I assume is part of the problem), and then rename and lock it for later use.

    Now with 11.2 on this clean W10 install with only RX installed, I made 3 snapshots and then ran the validator on the 3rd one as a test to see what it does. I was surprised that the validator showed "failed to repair" and then "repaired".

    Why would a brand new install of OS and RX start showing failed/suspect snapshots right out of the gate? Have the RED issues of the past not been resolved?
    In the past (with red line items) I had wished that you would offer an option to retry X times, or offer to do a rebooted snapshot for me while saving the name and lock state, or, barring that, just abort the creation of the snapshot if it could not be validated on its initial creation. Having a bad snapshot is the same as not having one.
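
A minimal sketch of the "retry X times, otherwise abort" behavior requested here. Everything in it is hypothetical: `take_snapshot`, `validate`, and `discard` are stand-in callbacks invented for illustration, not part of any real Rollback Rx interface, and the backend is simulated.

```python
from typing import Callable, List, Optional


def create_validated_snapshot(
    take_snapshot: Callable[[], str],
    validate: Callable[[str], bool],
    discard: Callable[[str], None],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the id of the first snapshot that validates, or None if all attempts fail."""
    for _ in range(max_attempts):
        snapshot_id = take_snapshot()
        if validate(snapshot_id):
            return snapshot_id
        # A known-bad snapshot is as useful as no snapshot, so drop it.
        discard(snapshot_id)
    return None


# Simulated backend (assumption): the first two attempts produce a bad snapshot.
_validation_results = iter([False, False, True])
_ids = iter(range(1, 100))
discarded: List[str] = []

result = create_validated_snapshot(
    take_snapshot=lambda: f"snap-{next(_ids)}",
    validate=lambda _snapshot_id: next(_validation_results),
    discard=discarded.append,
)
print(result, discarded)
```

With a policy like this, a snapshot taken from the systray would either come back validated or not exist at all, so there would be no silently bad entry to discover later.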

    This is important to me because I mostly use the systray to make snapshots. When you do that, you don't get to see the line item, so you can easily assume your snapshot is good and not verify its state before making any system changes.




    Q3: Why is the validator a separate program and not just a function in the main GUI?


    Thanks.







    exampleValidator.jpg
    Attached Files

  • #2
    Mate... I'd like to help, but I don't trust that module. Period.



    • #3


      I did some more testing over the 3 day weekend and have some more observations.
      The test is on the same clean-installed box, with the system set to reboot continuously after a 40-second delay at the desktop.

      1. The system was able to reboot 315 times before running into an issue. "Automatically delete unlocked.... if free space below" was set to 10000.
      On the 315th reboot the system hung in the RX splash screen (not sure what the official name for this screen is) with the message "Failed to find free checkpoint" (see attached)
      I was able to hit enter and get it to keep rebooting a few more times then it happened again.

      This error does not seem to be related to free space: there was still some left, and since the system was only rebooting, each snapshot was very small.
      I tried a manual RX defrag and a few more reboots, but the error kept happening.

      Only after I deleted snapshots (by setting "KEEP NO MORE THAN x SNAPSHOTS") did the system respond normally.

      Is it a bug that a high number of snapshots causes this error, or is something else going on?


      2. Before deleting the 300+ snapshots I tried the validator tool. With this many snapshots, when you click VALIDATE the button grays out for a bit and then becomes enabled again, but no actual validation seems to happen. So there seems to be a bug when validating more than X snapshots.


      3. After deleting the snapshots I ran the validator tool again and, like before, there were a LOT of "Failed to repair" results.
      In fact, of all the snapshots, the first ~14 validated and then the remaining ones just kept failing to repair. Either all the following snapshots were actually bad, or there is a bug in the validator such that once it hits the first bad snapshot, every further one is reported as bad regardless of the actual return value of the check. Are they really bad, or is it a bug? Who knows?

      Again I have to ask: why would so many snapshots be corrupted, if that is indeed the case, on a brand new install with only RX installed?

      Even if I don't use the validator tool how can one be assured that a snapshot can be restored correctly?



      4. The validator tool should have a SELECT ALL option.
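
To make the guess in observation 3 concrete, here is a small sketch of how a "sticky" failure flag would produce exactly that pattern (a run of good results, then every later snapshot reported as failed). This is purely a hypothesis illustrated with invented function names; it says nothing about how the real validator is implemented.

```python
from typing import List


def report_with_sticky_flag(check_results: List[bool]) -> List[str]:
    """Hypothetical buggy reporter: the failure flag is never reset between snapshots."""
    failed = False
    report = []
    for ok in check_results:
        if not ok:
            failed = True  # stays True for every snapshot that follows
        report.append("Failed to repair" if failed else "Validated")
    return report


def report_independently(check_results: List[bool]) -> List[str]:
    """Each snapshot is reported from its own check result only."""
    return ["Validated" if ok else "Failed to repair" for ok in check_results]


# Made-up per-snapshot check results: only the third snapshot is really bad.
actual = [True, True, False, True, True]
sticky = report_with_sticky_flag(actual)
independent = report_independently(actual)
print(sticky)
print(independent)
```

If the real results looked like the first list while the snapshots themselves restored fine, that would point at the reporting rather than the snapshots.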




      Thanks.






      failed_to_find_free_checkpoint.jpg validation failures example.jpg




        • #5
          Hiya,

          First question: The validator checks the integrity of the snapshot with regard to VSS, the registry, and the boot records, and tries to correct any problems. Usually the validator does fix things, so that a questionable snapshot changes status to "Good".

          Second question: I think it's best to first understand what a questionable snapshot is.

          A "questionable" snapshot is defined as one of the following:
          A. The system was not good when the snapshot was taken, so it is a bad snapshot.
          B. The system was good, so it was a good snapshot, but it later turned bad.


          Scenario B is something that the team here has to look into, as it is usually linked to a bug or something similar. Either way, when version 11 came out we created a detailed logging system for every snapshot activity to figure out why this happens.

          Scenario A, on the other hand, is pretty much the primary reason this tool was created in the first place, as consistency is an important aspect of the program.

          Rollback Rx checks the following (in detail):

          1. Whether VSS was used successfully in creating the snapshot. The primary function of VSS in the program is to make sure that all data gets committed to the HDD. A lot of the issues people have could simply be because they aren't flushing the system cache (unchecking the option when taking a snapshot).

          2. Whether the boot records, UEFI, and BCD are good. These are compared to the last consistent snapshot.

          3. Lastly, the Windows registry.
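
As a rough illustration of the three checks listed above, a snapshot's status could be derived along these lines. The `SnapshotChecks` type, the `classify` function, and the idea that any single failed check makes a snapshot "Questionable" are assumptions made for this sketch, not a description of the product's internals.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SnapshotChecks:
    """Hypothetical outcome of the three checks for one snapshot."""
    vss_ok: bool           # VSS was used successfully when the snapshot was created
    boot_records_ok: bool  # boot records, UEFI, and BCD match the last consistent snapshot
    registry_ok: bool      # Windows registry check passed


def classify(checks: SnapshotChecks) -> Tuple[str, List[str]]:
    """Return ("Good", []) or ("Questionable", [names of the failed checks])."""
    named = [
        ("VSS", checks.vss_ok),
        ("boot records", checks.boot_records_ok),
        ("registry", checks.registry_ok),
    ]
    failed = [name for name, ok in named if not ok]
    status = "Good" if not failed else "Questionable"
    return status, failed


good = classify(SnapshotChecks(vss_ok=True, boot_records_ok=True, registry_ok=True))
bad = classify(SnapshotChecks(vss_ok=True, boot_records_ok=False, registry_ok=True))
print(good)
print(bad)
```

Reporting which check failed, rather than only a red line item, would also help tell Scenario A from Scenario B.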

          So what I would do is, before installing Rollback Rx, consider running a scan for all three of those things to see if there's an issue, and then install Rollback Rx to see if the snapshots are questionable again. If that is the case, then we will need to take a look at it further.

          As for the validator issue, it's interesting and I'll have to bring it up with the team. I will let you know what the verdict is on that.



          Question 3: As I'm sure you're aware, the validator, although an interesting tool, has its flaws, so we thought we would iron out the bugs before making it part of the main GUI. The plan is indeed to add it in at a later point, once we feel comfortable enough with it.

          Let me know your thoughts!

          Thank you




          • #6

            Ram,

            Thank you for your reply.

            Can you comment on what “Failed to find free checkpoint” means? On the same test system, I reduced the setting to keep no more than 10 snapshots and I still get the error, so it is not related to free space or the number of snapshots. It is, however, preventing the system from finishing a reboot, which is bad when doing automation and expecting the system to come back from a non-fatal error. Of course, I am only assuming that “Failed to find free checkpoint” is a non-fatal error and just a bug on your part, as I can always click OK and the system will continue to boot.

            Attached are the test settings I am using.


            Thank you





            current settings.jpg



            • #7
              I'm not entirely sure offhand, but I'm currently testing exactly this scenario on a test machine and I will post the results.



              • #8
                So just an update.

                The test machine is a Windows 10 Pro mini PC with an SSD.

                I created 278 snapshots. They were all good, but I ran the snapshot validator anyway.

                After about 30 snapshots or so, the program couldn't validate any more: every time I tried, the program would refresh but it wouldn't validate. I made a video for documentation purposes, if you guys want to see it. The snapshots were good, though, which is a bit different from the OP's issue.

                We're going to analyze this now and hopefully correct it regardless, but I thought I'd let you guys know.

                Thanks



                • #9
                  Also, when I rebooted, I got the free checkpoint error as well. So at least that is possible to replicate.



                  • #10
                    Thanks for the update Ram....



                    • #11
                      We have a new build in the works that corrects this issue. We should have it out pretty soon, but I can't say exactly when because we're still testing it.

                      Thanks
