So, back in May I released v3.3.0, which, as the linked article explains, was a huge new release. When you make a lot of changes, fixes, and improvements, sometimes you get things wrong. And sometimes it's more serious than others.
The main driver behind this new v3.3.1 release was to fix a problem introduced in v3.3.0 with exporting very large volumes of data as HTML. I’m talking hundreds of thousands, or millions, of rows. Ironically, I thought I had actually fixed a memory and stability problem, but in fact I made it worse!
In brief, I used the Database Grid call “RecordCount” to check how many rows there were: if there were a lot, it would use filestreams, and if not, RAM. I used this call because I thought it gave me, erm, the record count. It turns out it does do that, of course, but only a count of the records ON SCREEN! I didn’t notice this until I dug deeper into the function’s documentation.
This was a problem because I was using it at the start of my SaveFILESToHTML function to help Quickhash decide whether to do the export in RAM, which it would do for fewer than 20K rows, or via a filestream, which it would use for anything above 20K rows. The reason I didn’t realise it wasn’t working when I made these changes for v3.3.0 earlier in 2021 was simply that, by chance, when I tested it on row counts of around 50K, the RAM of my machine could still handle it, so the file saved anyway and I assumed it was all working. I never ventured to try it on half a million rows. When it takes my machine two hours to hash 407K files, and given I have done that about 20 times over the last few weeks while debugging code, hopefully some of you can relate as to why my testing sometimes does not rank as highly as real-world usage.
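For anyone curious, here is roughly what that decision looks like in Lazarus/Free Pascal when the row count comes from a dedicated query instead of the grid’s dataset. This is only a minimal sketch, not Quickhash’s actual code: the database file name ('quickhash.db') and table name ('TBL_FILES') are illustrative assumptions.

```pascal
program CountRowsSketch;
{$mode objfpc}{$H+}
// Minimal sketch only, not Quickhash's actual code. The database file name
// ('quickhash.db') and table name ('TBL_FILES') are illustrative assumptions.
uses
  SysUtils, sqldb, sqlite3conn;

const
  ROW_THRESHOLD = 20000; // below this, build the export in RAM; above it, stream to disk

var
  Conn  : TSQLite3Connection;
  Trans : TSQLTransaction;
  Query : TSQLQuery;
  Rows  : Int64;
begin
  Conn  := TSQLite3Connection.Create(nil);
  Trans := TSQLTransaction.Create(nil);
  Query := TSQLQuery.Create(nil);
  try
    Conn.DatabaseName := 'quickhash.db';
    Conn.Transaction  := Trans;
    Trans.Database    := Conn;
    Conn.Open;
    Trans.StartTransaction;

    // A dedicated COUNT(*) gives the true number of rows in the table,
    // unlike the grid dataset's RecordCount, which only reflects what has
    // been fetched for display so far.
    Query.Database    := Conn;
    Query.Transaction := Trans;
    Query.SQL.Text    := 'SELECT COUNT(*) FROM TBL_FILES';
    Query.Open;
    Rows := Query.Fields[0].AsLargeInt;
    Query.Close;

    if Rows < ROW_THRESHOLD then
      WriteLn('Small result set (', Rows, ' rows): export via RAM')
    else
      WriteLn('Large result set (', Rows, ' rows): export via a filestream');
  finally
    Query.Free;
    Trans.Free;
    Conn.Free;
  end;
end.
```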
Anyway, this came to light in the real world thanks to a user who reported the problem to me after he, sadly, had spent 5 hours computing the hashes of 400K rows, only to find Quickhash crashed when he attempted to save the results as HTML. And it happened to him twice! So my error really wasted his time, and I'm sorry for that.
So, I have spent the last few weeks making changes to resolve this. I have done so by using dedicated TSQLQueries instead of the Database Grid technology, which, it transpires, was never designed for that volume of data. As a result, in my (many) tests of hashing 407K files, I can now save the output to HTML in under 10 seconds. The output file is huge, of course: 56MB, and Firefox takes about two minutes to open it!
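For those interested in the approach, here is a rough sketch of the idea: a dedicated TSQLQuery reads the rows and each one is written straight to a TFileStream as it is read. Again, this is only an illustration of the technique, not the actual SaveFILESToHTML code; the table and column names ('TBL_FILES', 'FileName', 'FileHash') are made up, and real code would also need to HTML-escape the values.

```pascal
unit ExportSketch;
// Rough sketch of the streaming idea only, not Quickhash's actual
// SaveFILESToHTML routine. Table and column names are illustrative.
{$mode objfpc}{$H+}

interface

uses
  Classes, SysUtils, sqldb;

procedure ExportToHTML(Conn: TSQLConnection; const OutFile: string);

implementation

procedure ExportToHTML(Conn: TSQLConnection; const OutFile: string);
var
  Query : TSQLQuery;
  FS    : TFileStream;

  procedure WriteStr(const S: string);
  begin
    if Length(S) > 0 then
      FS.WriteBuffer(S[1], Length(S));
  end;

begin
  Query := TSQLQuery.Create(nil);
  FS    := TFileStream.Create(OutFile, fmCreate);
  try
    // Read the rows with a dedicated query rather than through the grid.
    Query.Database    := Conn;
    Query.Transaction := Conn.Transaction;
    Query.SQL.Text    := 'SELECT FileName, FileHash FROM TBL_FILES';
    Query.PacketRecords := 1000; // fetch from the database in batches rather than all at once
    Query.Open;

    WriteStr('<html><body><table>' + LineEnding);
    // Write each row straight to the file as it is read, instead of
    // building the whole HTML document in RAM first.
    while not Query.EOF do
    begin
      WriteStr('<tr><td>' + Query.FieldByName('FileName').AsString +
               '</td><td>' + Query.FieldByName('FileHash').AsString +
               '</td></tr>' + LineEnding);
      Query.Next;
    end;
    WriteStr('</table></body></html>' + LineEnding);

    Query.Close;
  finally
    FS.Free;
    Query.Free;
  end;
end;

end.
```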
Then, of course, I discovered the same issue was affecting output to CSV text files as well, so I had to make the same changes for that too.
In addition to this fix, which is the biggest fix of note, I have also made some other minor corrections. You can see which ones by looking at the closed issues in the Issues tracker over on GitHub; several others were ones of my own that I have identified over the last few months.
I hope this version works better for folks who hash large numbers of files and need to save the output as HTML.
Thanks a lot, your work to keep the software updated is much appreciated! Best regards.
Marco